METHODS FOR IMMOBILIZING POLYPEPTIDES



CROSS-REFERENCES TO RELATED APPLICATIONS 

This application claims the benefit of U.S. Provisional Patent Application
Serial No. 60/212620, filed on June 19, 2000, which is herein incorporated by reference in
its entirety.

BACKGROUND OF THE INVENTION 

Field of the Invention 

This invention pertains to the field of immobilizing a polypeptide to a surface, 
and methods of using such immobilized polypeptides for proteomics and high-throughput 
screening. 

Background 

A vast number of new drug targets are now being identified using a
combination of genomics, bioinformatics, genetics, and high-throughput biochemistry.
Genomics provides information on the genetic composition and the activity of an organism's
genes. Bioinformatics uses computer algorithms to recognize and predict structural patterns
in DNA and proteins, defining families of related genes and proteins. Genomics, however,
cannot provide a complete understanding of the cellular processes that are involved in
disease processes because such processes are mediated by proteins. Genomics alone
provides little or no information as to, for example, the relative abundance of different
proteins in a cell, and the types of post-translational modifications present on proteins.

Proteomics is providing a new weapon for bridging the gap between
genomics and disease processes. Proteomics involves the study of proteins in biological
samples. For example, proteomics can involve comparing the proteins present in a diseased
cell to those in a non-diseased cell to identify disease-specific proteins. The combination of
proteomics with the other approaches is expected to greatly boost the number of potential
drug targets that are of interest for the development of new drugs.

The number of chemical compounds available for screening as potential
drugs is also growing dramatically due to recent advances in combinatorial chemistry, the
production of large numbers of organic compounds through rapid parallel and automated
synthesis. The compounds produced in the combinatorial libraries being generated will far
outnumber those compounds being prepared by traditional, manual means, natural product
extracts, or those in the historical compound files of large pharmaceutical companies. Both
the rapid increase of new drug targets and the availability of vast libraries of chemical
compounds create an enormous demand for new technologies that improve the screening
process.

Drug screening is further complicated by the need to
identify highly specific lead compounds early in the drug discovery process. Proteins within
a structural family share similar binding sites and catalytic mechanisms. Often, a compound
that effectively interferes with the activity of one family member, as desired, also
interferes with other members of the same family. Cross-reactivity of a drug with related
proteins can be the cause of low efficacy or even side effects in patients. For instance, AZT,
a major treatment for AIDS, blocks not only viral polymerases, but also human polymerases,
causing deleterious side effects. Cross-reactivity with closely related proteins is also a
problem with nonsteroidal anti-inflammatory drugs (NSAIDs) and aspirin. These drugs
inhibit cyclooxygenase-2, an enzyme which promotes pain and inflammation. However, the
same drugs also strongly inhibit a related enzyme, cyclooxygenase-1, that is responsible for
keeping the stomach lining and kidneys healthy, leading to common side effects including
stomach irritation. Using standard technology to discover such additional interactions
requires a tremendous effort in time and cost and as a consequence is simply not done. The
ability to analyze a multitude of members of a protein family or forms of a polymorphic
protein in parallel (multitarget screening) would enable quick identification of highly
specific lead compounds that do not exhibit undesirable cross-reactivity.

Current technological approaches for obtaining high-throughput screening of
proteins and other targets for drugs include multiwell-plate based screening systems, cell-
based screening systems, microfluidics-based screening systems, and screening of soluble
targets against solid-phase synthesized drug components. For example, methods are
available for synthesizing potential drugs on a solid phase and assaying the immobilized
drugs for ability to interact with a soluble protein or other target. However, screening of
soluble targets against solid-phase synthesized drug components is intrinsically limited. The
surfaces required for solid state organic synthesis are chemically diverse and often cause the
inactivation or non-specific binding of proteins, leading to a high rate of false-positive
results. Furthermore, the chemical diversity of drug compounds is limited by the
combinatorial synthesis approach that is used to generate the compounds at the interface.
Another major disadvantage of this approach stems from the limited accessibility of the
binding site of the soluble target protein to the immobilized drug candidates.

Attachment of the drug target, rather than the potential drug, to a solid
support has proven useful for screening of molecules that interact with DNA. Miniaturized
DNA chip technologies have been developed (for example, see U.S. Patent Nos. 5,412,087,
5,445,934 and 5,744,305) and are currently being exploited for nucleic acid hybridization
and other assays. However, DNA biochip technology is not transferable to protein arrays
because the chemistries and materials used for DNA biochips are not readily transferable to
use with proteins. Nucleic acids withstand temperatures up to 100°C, can be dried and re-
hydrated without loss of activity, and can be bound directly to organic adhesion layers
supported by materials such as glass while maintaining their activity. In contrast, proteins
must remain hydrated and be kept at ambient temperatures, and are very sensitive to the
physical and chemical properties of the support materials. Therefore, maintaining protein
activity at the liquid-solid interface requires entirely different immobilization strategies
than those used for nucleic acids. Additionally, the proper orientation of the protein at the
interface is desirable to ensure accessibility of its active site to interacting molecules. With
miniaturization of the chip and decreased feature sizes, the ratio of accessible to non-
accessible proteins becomes increasingly relevant and important.

For the foregoing reasons, there is a need for miniaturized protein arrays, and
for methods of synthesizing such arrays. The present invention fulfills these and other
needs.

SUMMARY OF THE INVENTION

In one aspect the present invention provides for methods for immobilizing a
polypeptide to a surface. These methods comprise contacting a polypeptide which
comprises an ester or thioester with an anchor molecule comprising a first nucleophilic
group at a 2 or 3 position relative to a second nucleophilic group, wherein the ester or
thioester undergoes a trans-esterification reaction with the first nucleophilic group, thus
forming an intermediate compound in which the polypeptide is attached to the anchor
molecule through the first nucleophilic group; and attaching the anchor molecule to a
surface.

In some embodiments, the polypeptide comprising an ester or a thioester is
obtained by use of inteins. These methods generally involve expressing a chimeric gene that
encodes a fusion protein which comprises: a) the polypeptide, and b) an intein, or a
functional portion thereof, which is joined to the polypeptide at a splice junction at the
amino terminus of the intein. The carboxyl terminus of the intein generally lacks a
functional splice junction. The fusion protein is contacted with a nucleophilic compound
which releases the polypeptide from the intein at the splice junction and forms the
polypeptide that comprises a terminal ester or thioester.

The present invention provides methods for forming an array of immobilized
polypeptides. The arrays are composed of a plurality of polypeptide species attached to a
surface. The methods involve contacting members of a population of polypeptide species,
each of which comprises an ester or thioester, with anchor molecules that have a first
nucleophilic group at a 2 or 3 position relative to a second nucleophilic group. The ester or
thioester undergoes a trans-esterification reaction with the first nucleophilic group, thus
forming an intermediate compound in which the polypeptides are attached to the anchor
molecules through the first nucleophilic group. The intermediate compound can then
undergo an intramolecular rearrangement in which the second nucleophilic group on the
anchor molecule displaces the first nucleophilic group, thus forming a more stable bond
between the anchor molecule and the polypeptide (e.g., an amide bond). The anchor
molecules are then attached to a surface, if not already attached prior to the linking reaction.
Each polypeptide species is attached to a separate region of the surface.

Also provided are arrays of immobilized polypeptides attached to a surface.
These arrays include at least a first polypeptide species and a second polypeptide species,
each of which is: a) attached to a separate region of the surface, b)
attached to the surface in the same orientation, and c) folded in a secondary structure as
required for a biological activity.

The invention also provides arrays of immobilized polypeptides attached to a
surface. The surface has a plurality of surface regions, and to each surface region is attached
a polypeptide species and a polynucleotide that encodes the polypeptide species.

Also provided by the invention are methods for screening a library of nucleic
acids to identify a nucleic acid that encodes a polypeptide having a desired activity. These
methods involve expressing a plurality of fusion proteins, each of which is encoded by an
expression cassette that comprises: a) a member of the library of nucleic acids, b) an intein
coding region; and c) an open reading frame that encodes a polypeptide that is displayed on a
surface of a replicable genetic package. The fusion proteins are displayed on the surface of a
replicable genetic package. The replicable genetic packages are then screened to identify
those that display a polypeptide having the desired activity.

The invention also provides nucleic acids that include an expression cassette
that has: an insertion site at which a polynucleotide can be introduced into the expression
cassette, an intein coding region, and an open reading frame that encodes a polypeptide that
is displayed on a surface of a replicable genetic package. In some embodiments, the
carboxyl terminus of the intein coding region is mutated so that it does not function as a
splice junction for intein-mediated cleavage. The introduction of a polynucleotide at the
insertion site results in an open reading frame that encodes a fusion protein which comprises
a polypeptide encoded by the polynucleotide, which polypeptide is attached at its carboxyl
terminus to an amino terminus of the intein, and the surface-displayed polypeptide is
attached to a carboxyl terminus of the intein. These expression cassettes are useful for the
screening methods of the invention.

In another aspect, the invention provides for methods for immobilizing a
polypeptide to a surface, wherein the method comprises contacting a polypeptide which
comprises an ester or thioester, with an anchor molecule comprising a first nucleophilic
group at a 2 or 3 position relative to a second nucleophilic group, wherein the ester or
thioester undergoes a trans-esterification reaction with the first nucleophilic group, thus
forming an intermediate compound in which the polypeptide is attached to the anchor
molecule through the first nucleophilic group; wherein said intermediate compound
undergoes an intramolecular rearrangement in which the second nucleophilic group on the
anchor molecule displaces the first nucleophilic group, thus forming a bond between the
anchor molecule and the polypeptide; and attaching the anchor molecule to a surface.
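
By way of illustration only, the two steps recited above (trans-esterification followed by intramolecular rearrangement) can be written as a reaction scheme. The scheme is a sketch that assumes, as one possible anchor molecule, a 1,2-aminothiol in which the thiol serves as the first nucleophilic group and the amine at the 2 position serves as the second nucleophilic group. The leaving group R' and the shorthand labels Polypeptide and Anchor are introduced here for illustration; Y denotes a sulfur or oxygen atom.

```latex
\begin{align*}
\text{Polypeptide-C(=O)-Y-R}' \;+\; \text{HS-CH}_2\text{-CH(NH}_2\text{)-Anchor}
  &\longrightarrow \text{Polypeptide-C(=O)-S-CH}_2\text{-CH(NH}_2\text{)-Anchor} \;+\; \text{HY-R}' \\
  &\longrightarrow \text{Polypeptide-C(=O)-NH-CH(CH}_2\text{SH)-Anchor}
\end{align*}
```

Under this assumption, the first line is the trans-esterification that yields the thioester-linked intermediate, and the second line is the rearrangement in which the neighboring amine displaces the thiol to give an amide bond.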

In yet another aspect, the invention provides for methods for immobilizing a
polypeptide to a surface, wherein the method comprises: contacting a polypeptide which
comprises an ester or thioester, with an anchor molecule comprising a reactive group
selected from the group consisting of a NH2-NH-R group and an aminooxy group, wherein R
represents an anchor molecule, wherein the ester or thioester reacts with the reactive group,
thus forming a compound comprising a polypeptide attached to the anchor molecule through
the reactive group.

In another aspect, the invention provides for a kit for use in immobilizing one
or more polypeptides containing an ester or thioester to a surface of a substrate. In certain
embodiments, the kit includes an anchor molecule reagent for adapting the ester- or
thioester-containing polypeptide to the surface, the anchor molecule having a first nucleophilic
group at a 2 or 3 position relative to a second nucleophilic group; wherein the ester or thioester of
the one or more polypeptides undergoes a trans-esterification reaction with the first
nucleophilic group, thus forming an intermediate compound in which the polypeptides are
attached to the anchor molecules through the first nucleophilic group, the anchor molecule
being adapted for attachment to the surface of the substrate. In other embodiments, the kit
comprises an anchor molecule comprising a reactive group such as a hydrazine group (e.g.,
NH2NH-R, where R is the anchor molecule), a hydroxylamine, or an aminooxy group, etc.

In some embodiments, the kits comprise a container for the contents of the
kit. Certain embodiments of the kit further include, for example, a DNA vector for
introducing the ester or thioester into the polypeptide, where the vector is adapted to receive
a nucleic acid sequence encoding the polypeptide to form an ester or thioester polypeptide
expression vector for expressing the polypeptide as an ester or thioester polypeptide having
the ester or the thioester incorporated therein; where the kit further includes a chemical agent
for introducing into the polypeptide an ester or thioester; where the kit further includes
instructions for instructing a user to carry out methods of using the kit; where the kit further
includes a substrate for attaching the anchor molecules thereto for immobilizing the
polypeptides thereon; where the kit has the anchor molecule being supplied attached to the
surface of the substrate for later attaching the polypeptide thereto by a user; where the kit
contains said polypeptides, and where said polypeptides are supplied with said kit
pre-coupled with said anchor molecule(s).

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 depicts a schematic of two embodiments of methods for
immobilizing a polypeptide comprising a thioester or ester to a surface. In certain
embodiments, the ester or thioester is also attached to an intein. The symbol 'R represents a
reactive group, such as a reactive group comprising a first nucleophilic group at a 2 or 3
position relative to a second nucleophilic group, or a reactive group such as a hydrazine group,
a hydroxylamine group, or an aminooxy group, etc. The structure denoted A is an anchor
molecule. The symbol ^ represents a reactive group, a binding surface, amino acid
residue(s), etc. on the anchor molecule that are able to bind to a surface (black bar) through
covalent and/or non-covalent (e.g., ionic bonds) interactions. The symbol Y represents a
sulfur or oxygen atom. In panel A, the anchor molecule comprising the reactive group 'R is
already immobilized to the surface. The reactive group 'R then reacts with the polypeptide
comprising a thioester or ester to form a polypeptide that is immobilized through the reactive
group 'R to the immobilized anchor molecule. In panel B, the anchor molecule comprising the
reactive group 'R and ^ is initially free in solution. The reactive group 'R then reacts with
the polypeptide comprising a thioester or ester to form a polypeptide that is attached to the
anchor molecule through 'R. This molecule is then immobilized to a surface (black bar)
through covalent and/or non-covalent interactions to form a polypeptide that is immobilized
to a surface through an anchor molecule containing reactive group 'R and attachment group
^. The surface can be essentially any two- or three-dimensional surface.

Figure 2 depicts a schematic of an embodiment for immobilizing a
polypeptide comprising a thioester or ester to a surface. The symbols in Figures 2 and 3 are
the same as set out above for Figure 1. In these embodiments, the polypeptide comprising a
thioester or ester is contacted with an activating compound, as exemplified by the thiol
reagent HS-R in Figure 2. Additional activating compounds are also described herein. The
activating compound displaces the intein and the resulting molecule is then contacted with
the anchor molecule that is free in solution. The polypeptide is then attached to the anchor
molecule through an ester or thioester bond. The anchor molecule is then affixed to the
surface as set out in Figure 1.

WO 01/98458    PCT/US01/19531

Figure 3 depicts a schematic of a variant of the embodiment depicted in
Figure 2. In these embodiments, the anchor molecule is already immobilized to a surface
through ^R.

DETAILED DESCRIPTION 

Definitions 

A "protein" or "polypeptide" means a polymer of amino acid residues linked
together by amide bonds. Typically, as used herein, the terms refer to a polymer that is of a
length greater than that which is readily synthesized chemically using stepwise addition of
amino acids. Thus, a "polypeptide" or "protein" generally has at least about 50 amino acids,
and more preferably is at least about 60, 75, or 100 amino acids in length. A "polypeptide,"
as the term is used herein, includes without limitation, a "protein," a "polyamino acid," a
"peptide," etc. A "polypeptide" typically has a biological activity (e.g., binding a target
molecule, enzymatic activity) or other feature that is dependent upon the "polypeptide"
folding into a particular secondary and/or tertiary structure. A "polypeptide" can be
naturally occurring, recombinant, or synthetic, or any combination of these. A
"polypeptide" can also be just a fragment of a naturally occurring "polypeptide" or peptide.
A "polypeptide" can be a single molecule or can be a multi-molecular complex. The term
"polypeptide" can also apply to amino acid polymers in which one or more amino acid
residues is an artificial chemical analogue of a corresponding naturally occurring amino acid.
An amino acid polymer in which one or more amino acid residues is an "unnatural" amino
acid, not corresponding to any naturally occurring amino acid, is also encompassed by the
use of the term "polypeptide" herein.

The term "antibody" means an immunoglobulin, whether natural or wholly or
partially synthetically produced. All derivatives thereof which maintain specific binding
ability are also included in the term. The term also covers any "polypeptide" having a
binding domain which is homologous or largely homologous to an immunoglobulin binding
domain. These "polypeptides" can be derived from natural sources, or partly or wholly
synthetically produced. An antibody can be monoclonal or polyclonal. The antibody can be
a member of any immunoglobulin class, including any of the human classes: IgG, IgM, IgA,
IgD, and IgE. Derivatives of the IgG class, however, are preferred in the present invention.
The term "antibody fragment" refers to any derivative of an antibody which is
less than full-length. Preferably, the antibody fragment retains at least a significant portion
of the full-length antibody's specific binding ability. Examples of antibody fragments
include, but are not limited to, Fab, Fab', F(ab')2, scFv, Fv, dsFv, diabody, and Fc fragments.
The antibody fragment can be produced by any means. For instance, the antibody fragment
can be enzymatically or chemically produced by fragmentation of an intact antibody or it can
be recombinantly produced from a gene encoding the partial antibody sequence.
Alternatively, the antibody fragment can be wholly or partially synthetically produced. The
antibody fragment can optionally be a single chain antibody fragment. Alternatively, the
fragment can comprise multiple chains which are linked together, for instance, by disulfide
linkages. The fragment can also optionally be a multimolecular complex. A functional
antibody fragment will typically comprise at least about 50 amino acids and more typically
will comprise at least about 200 amino acids.

Single-chain Fvs (scFvs) are recombinant antibody fragments consisting of
only the variable light chain (VL) and variable heavy chain (VH) covalently connected to one
another by a polypeptide linker. Either VL or VH can be the NH2-terminal domain. The
polypeptide linker can be of variable length and composition so long as the two variable
domains are bridged without serious steric interference. Typically, the linkers are comprised
primarily of stretches of glycine and serine residues with some glutamic acid or lysine
residues interspersed for solubility.

"Diabodies" are dimeric scFvs. The components of diabodies typically have
shorter peptide linkers than most scFvs and they show a preference for associating as dimers.

An "Fv" fragment is an antibody fragment which consists of one VH and one
VL domain held together by noncovalent interactions. The term "dsFv" is used herein to
refer to an Fv with an engineered intermolecular disulfide bond to stabilize the VH-VL pair.

A "F(ab')2" fragment is an antibody fragment essentially equivalent to that
obtained from immunoglobulins (typically IgG) by digestion with the enzyme pepsin at pH
4.0-4.5. The fragment can be recombinantly produced.

A "Fab'" fragment is an antibody fragment essentially equivalent to that
obtained by reduction of the disulfide bridge or bridges joining the two heavy chain pieces in
the F(ab')2 fragment. The Fab' fragment can be recombinantly produced.

A "Fab" fragment is an antibody fragment essentially equivalent to that
obtained by digestion of immunoglobulins (typically IgG) with the enzyme papain. The Fab
fragment can be recombinantly produced. The heavy chain segment of the Fab fragment is
the Fd piece.

An "array" is an arrangement of entities in a pattern on a substrate. Although
the pattern is typically a two-dimensional pattern, the pattern can also be a three-dimensional
pattern. An array of polypeptide species refers to at least two different species of polypeptide
that are attached to a support. An "array" includes a plurality of microparticles, wherein
each microparticle displays at least one different polypeptide as compared to another
microparticle in the array. An "array" can include a plurality of replicable genetic packages.
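
The definition above can be illustrated with a minimal sketch. It models a support as a mapping from regions to polypeptide species and checks the stated condition that at least two different species are attached. The `Region` class, the function name, and the species labels are hypothetical names introduced here for illustration only and are not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A hypothetical addressable position on the support (row, column)."""
    row: int
    col: int


def is_polypeptide_array(layout: dict[Region, str]) -> bool:
    """Return True if at least two different polypeptide species are attached."""
    return len(set(layout.values())) >= 2


# Two illustrative species, each attached to its own region of the support.
layout = {Region(0, 0): "species-1", Region(0, 1): "species-2"}
print(is_polypeptide_array(layout))
```

Because the mapping is keyed by region, each region holds one species in this sketch; a microparticle-based array could be modeled the same way by using a particle identifier as the key.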

"Microparticles" suitable for use as substrates or supports in the practice of
the present invention may be selected, according to circumstances, from the group including
beads, resins, and particles used in chemical synthesis processes; isotropic and anisotropic
particles; cylinders, including stacked cylinders and/or taggants including microfiber
bundles, where such particles may be made from substrate materials described elsewhere in
this disclosure or known to those of ordinary skill in the art as suitable for use as a substrate
as described herein; and organisms and their remains such as diatoms, bacteria, spores, and
yeast. Such microparticles range in size between 1 millimeter (mm) and 1 nanometer (nm),
preferably from 100 micrometers (µm) to 100 nm, more preferably between 10 µm and 100
nm, and are capable of being functionalized in a manner suitable for use as a substrate in the
practice of the present invention.
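
The nested size ranges stated above can be made concrete with a short sketch that converts each bound to nanometers. The function name, the class labels, and the choice to treat every bound as inclusive are assumptions made here for illustration.

```python
def size_class(diameter_nm: float) -> str:
    """Classify a microparticle diameter (in nanometers) against the stated ranges."""
    if 100 <= diameter_nm <= 10_000:       # 100 nm to 10 µm
        return "more preferred"
    if 100 <= diameter_nm <= 100_000:      # 100 nm to 100 µm
        return "preferred"
    if 1 <= diameter_nm <= 1_000_000:      # 1 nm to 1 mm
        return "within range"
    return "outside range"


for d in (5, 500, 50_000, 2_000_000):
    print(d, size_class(d))
```

The checks run from the narrowest range to the widest so that a diameter falling in several ranges is reported under the most specific one.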

The term "coating" means a layer that is either naturally or synthetically
formed on or applied to the surface of the substrate. For instance, exposure of a substrate,
such as silicon, to air results in oxidation of the exposed surface. In the case of a substrate
made of silicon, a silicon oxide coating is formed on the surface upon exposure to air. In
other instances, the coating is not derived from the substrate and may be placed upon the
surface via mechanical, physical, electrical, or chemical means. An example of this type of
coating would be a metal coating that is applied to a silicon or polymer substrate or a silicon
nitride coating that is applied to a silicon substrate. Although a coating may be of any
thickness, typically the coating has a thickness smaller than that of the substrate. A substrate
suitable for use in the present invention may be part of a medical device, for example, a stent
or appliance placed within a patient, where it is desired to have oriented display of one or
more compounds from such substrate.

An "interlayer" is an additional coating or layer that is positioned between the
first coating and the substrate. Multiple interlayers may optionally be used together. The
primary purpose of a typical interlayer is to aid adhesion between the first coating and the
substrate. One such example is the use of a titanium or chromium interlayer to help adhere a
gold coating to a silicon or glass surface. However, other possible functions of an interlayer
are also anticipated. For instance, some interlayers may perform a role in the detection
system of the array (such as a semiconductor or metal layer between a nonconductive
substrate and a nonconductive coating).

An "organic thinfilm" is a thin layer of organic molecules which has been
applied to a substrate or to a coating on a substrate if present. Organic thinfilms and
methods for making organic thinfilms are known in the art and include, without limitation,
those described in Wagner et al. USSN 09/353,555, filed July 14, 1999, which is herein
incorporated in its entirety for all purposes and for the purpose of teaching surface
chemistries and organic thinfilms. Typically, an organic thinfilm is less than about 20 nm
thick. Optionally, an organic thinfilm may be less than about 10 nm thick. An organic
thinfilm may be disordered or ordered. For instance, an organic thinfilm can be amorphous
(such as a chemisorbed or spin-coated polymer) or highly organized (such as a Langmuir-
Blodgett film or self-assembled monolayer). An organic thinfilm may be heterogeneous or
homogeneous. Organic thinfilms which are monolayers are preferred. A lipid bilayer or
monolayer is a preferred organic thinfilm. Optionally, the organic thinfilm may comprise a
combination of more than one form of organic thinfilm. For instance, an organic thinfilm
may comprise a lipid bilayer on top of a self-assembled monolayer. A hydrogel may also
compose an organic thinfilm. The organic thinfilm will typically have functionalities
exposed on its surface which serve to enhance the surface conditions of a substrate or the
coating on a substrate in any of a number of ways. For instance, exposed functionalities of
the organic thinfilm are typically useful in the binding or covalent immobilization of the
"polypeptides" to the patches of the array. Alternatively, the organic thinfilm may bear
functional groups (such as polyethylene glycol (PEG)) which reduce the non-specific
binding of molecules to the surface. Other exposed functionalities serve to tether the
thinfilm to the surface of the substrate or the coating. Particular functionalities of the
organic thinfilm may also be designed to enable certain detection techniques to be used with
the surface. Alternatively, the organic thinfilm may serve the purpose of preventing
inactivation of a "polypeptide" immobilized on a patch of the array or analytes which are
"polypeptides" from occurring upon contact with the surface of a substrate or a coating on
the surface of a substrate.

A "monolayer" is a single-molecule thick organic thinfilm. A monolayer
may be disordered or ordered. A monolayer may optionally be a polymeric compound, such
as a polynonionic polymer, a polyionic polymer, or a block-copolymer. For instance, the
monolayer may be composed of a poly(amino acid) such as polylysine. A monolayer which
is a self-assembled monolayer, however, is most preferred. One face of the self-assembled
monolayer is typically composed of chemical functionalities on the termini of the organic
molecules that are chemisorbed or physisorbed onto the surface of the substrate or, if
present, the coating on the substrate. Examples of suitable functionalities of monolayers
include the positively charged amino groups of poly-L-lysine for use on negatively charged
surfaces and thiols for use on gold surfaces. Typically, the other face of the self-assembled
monolayer is exposed and may bear any number of chemical functionalities (end groups).
Preferably, the molecules of the self-assembled monolayer are highly ordered.

The term "fusion protein" refers to a protein composed of two or more
polypeptides that, although typically unjoined in their native state, are joined by their
respective amino and carboxyl termini through a peptide linkage to form a single continuous
polypeptide. It is understood that the two or more polypeptide components can either be
directly joined or indirectly joined through a peptide linker/spacer.

"Proteomics" means the study of or the characterization of either the
proteome or some fraction of the proteome. The "proteome" is the total collection of the
intracellular proteins of a cell or population of cells and the proteins secreted by the cell or
population of cells. This characterization most typically includes measurements of the
presence, and usually quantity, of the proteins which have been expressed by a cell. The
function, structural characteristics (such as post-translational modification), and location
within the cell of the proteins can also be studied. "Functional proteomics" refers to the
study of the functional characteristics, activity level, and structural characteristics of the
protein expression products of a cell or population of cells.

The practice of this invention can involve the construction of recombinant
nucleic acids and the expression of genes in transfected host cells. Molecular cloning
techniques to achieve these ends are known in the art. A wide variety of cloning and in vitro
amplification methods suitable for the construction of recombinant nucleic acids such as
expression vectors are well-known to persons of skill. Examples of these techniques and
instructions sufficient to direct persons of skill through many cloning exercises are found in
Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152 Academic Press, Inc., San Diego, CA (Berger); and Current Protocols in
Molecular Biology, F.M. Ausubel et al., eds., Current Protocols, a joint venture between
Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (2000 Supplement)
(Ausubel).

Description of the Preferred Embodiments 

The invention provides for methods of immobilizing a polypeptide to a
surface, arrays of such polypeptides, and kits for immobilizing a polypeptide to a surface,
etc. The immobilized polypeptides of the invention provide significant advantages over
previously available immobilized polypeptides and the methods for forming them.
Previously available methods for producing polypeptide arrays required either step-wise
synthesis of the polypeptide while immobilized on the surface, or nonspecific cross-linking
to the support of functional groups present on side chains of amino acids present in a
particular polypeptide. Both methods have significant disadvantages. Step-wise synthesis
on a surface (e.g., a chip) is limited by the efficiency and accuracy of the available
methods of peptide synthesis. As a practical matter, peptide synthesis methods are limited to
peptides of about 60 amino acids and less. Moreover, it can be difficult or impossible to
obtain proper secondary and tertiary structure of a protein that is synthesized by step-wise
peptide synthesis.

Cross-linking functional groups on a polypeptide to a reactive group on a
surface, the other major method for immobilizing polypeptides on a surface, is often
problematic. An example of such methods involves the formation of a disulfide cross-link
between cysteine residues present in the polypeptide and an immobilized thiol-containing
group. Because the amino acid with the corresponding functional group can be found at
multiple locations within a polypeptide, and/or can be present near a site necessary for
biological activity of the polypeptide, cross-linking at all such sites can interfere with or
even eliminate the biological activity.
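The multiplicity problem can be made concrete with a short, purely illustrative Python sketch. It lists every cysteine in a sequence; each one is a potential disulfide cross-link site, whereas attachment through the carboxy terminus offers exactly one site per chain. The function name and the example sequence are hypothetical and were invented for this illustration; they do not describe any real protein.

```python
from typing import List


def attachment_sites(sequence: str, residue: str = "C") -> List[int]:
    """Return the 1-based positions of every occurrence of `residue`.

    With the default residue "C" this is the list of cysteines that a
    thiol-reactive surface could cross-link to.
    """
    return [i + 1 for i, aa in enumerate(sequence.upper()) if aa == residue]


# Hypothetical 30-residue sequence (illustration only, not a real protein).
example = "MKCAVLGCEWTRPQCDNSHYFIKLMACGTV"
sites = attachment_sites(example)

print("possible side-chain cross-link sites:", len(sites), sites)
print("C-terminal attachment sites: 1 (position %d)" % len(example))
```

A sequence with several cysteines therefore yields a mixture of orientations on the surface, while a chain with none cannot be attached by this route at all.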

Unlike previously available methods for forming polypeptide arrays, the
methods of the present invention permit a polypeptide to be attached to a surface using a
single discrete attachment point on the polypeptide. While the previous methods generally
result in a polypeptide being attached to the surface at several amino acid residues (e.g., each
cysteine residue present in the protein), the methods of the invention allow one to attach a
polypeptide to a surface at a discrete point (e.g., its carboxy terminus). Thus, one can obtain
arrays in which each polypeptide is identically oriented. The ability to attach one or more
polypeptides in a single orientation and with only one attachment point greatly increases the
ability to screen potential therapeutic or other agents for ability to interact with the
polypeptides in the array.

The methods of the invention involve functionalizing a polypeptide with an
ester or thioester at the point of desired attachment (e.g., the carboxy terminus of the
polypeptide), and reacting the ester or thioester with a molecule that has a first nucleophilic
group at the 2 or 3 position relative to a second nucleophilic group. An example of a
suitable molecule for this purpose is a 2-aminonucleophile, such as a 2-aminothiol. This
nucleophilic molecule can be used to attach the polypeptide to a solid support. The ester or
thioester and the first nucleophilic group of the compound undergo a transesterification
reaction, thus producing an intermediate in which the polypeptide is linked to the compound
by an ester or thioester bond. The intermediate then undergoes a spontaneous rearrangement
to form a more stable bond between the polypeptide and the second nucleophilic group on
the compound. In other embodiments, the ester- or thioester-containing polypeptide is
immobilized by contacting the polypeptide with an anchor molecule containing a reactive
group such as a hydrazine group, a hydroxylamine, or an aminooxy group, etc.

In certain embodiments, the thioester- or ester-containing polypeptide to be
immobilized also comprises an intein, an intein fragment, or a mutated intein, etc. (see, e.g.,
Fig. 1). These intein-containing polypeptides are then reacted with a reactive group on the
anchor molecule that is pre-immobilized to a surface or is subsequently immobilized to a
surface (see, e.g., Fig. 1). In other embodiments, an activating compound is contacted with
the polypeptide comprising an intein, an intein fragment, or a mutated intein (see Figs. 2 and
3) prior to contact with the anchor molecule comprising a reactive group. The intein
chemistry, anchor molecules, activating compounds, and reactive groups will be described in
more detail below.

A. Derivatization of Polypeptides

The polypeptide arrays of the invention are made by introducing an ester
group into the polypeptide at a specific position, generally at the carboxyl terminus of the
polypeptide, and using this group to attach the polypeptide to a support. The ester group, as
the term is used herein, can be any type of ester, including thioesters and the like, in addition
to alcohol-derived esters.

1. Chemical derivatization 

The derivatization to introduce the ester or thioester group into the
polypeptide can be accomplished in any of several ways. For example, chemical synthesis
methods can be used to make a suitably derivatized polypeptide. Such methods are
generally useful for relatively short polypeptides. One suitable method involves step-wise
synthesis of a peptide on a resin that has an unoxidized thiol. The thiol is reacted with a
protected amino acid succinimide to produce an aminothioester resin. The peptide is then
synthesized on the resin, after which it is released with an appropriate compound to produce
the desired peptide with a C-terminal thioester (see, WO 96/34878).

Chemical ligation provides another means by which a synthetic fragment
(e.g., which contains an ester or thioester) can be joined to a polypeptide of interest (Dawson
et al. (1994) Science 266: 776-779; Tam et al. (1995) Proc. Nat'l Acad. Sci. USA 92:
12485-12489; Canne et al. (1996) J. Am. Chem. Soc. 118: 5891-5896; and Wilken and Kent
(1998) Curr. Op. Biotechnol. 9: 412-426). For example, native chemical ligation involves
the chemical ligation of an unoxidized N-terminal cysteine on a first polypeptide to a
C-terminal thioester of a polypeptide of interest. A β-thioester intermediate is formed in which
the first polypeptide is linked to the C-terminus of the polypeptide. This intermediate
undergoes a spontaneous intramolecular rearrangement, which results in the two molecules
becoming linked by an amide bond (see, e.g., WO 96/34878). A catalytic thiol can be
included in the reaction mixture. Native chemical ligation can be used, for example, to link
a polypeptide that is derivatized to facilitate attachment to a solid support to a polypeptide of
interest for analysis. The native chemical ligation reaction can be conducted before
attaching the attachment polypeptide to a surface, or after attachment has occurred.

2. Intein-mediated derivatization

In some embodiments, the polypeptide having an ester is obtained using
inteins, which are also known as "protein introns," "intervening protein sequences," "protein
spacers," and the like. Inteins are somewhat analogous to introns found in mRNA
molecules. As is the case for introns, inteins are spliced out of the respective polypeptide,
resulting in joining of the portion of the polypeptide N-terminal to the intein (the "N-extein")
with the polypeptide portion that is to the C-terminal side of the intein (the "C-extein"). The
splicing reaction involves an acyl rearrangement between the S or O side chain of a cysteine,
threonine or serine residue at the N-terminal of the intein and the peptide bond which
connects the Cys, Thr or Ser residue to the N-extein.

This rearrangement results in an intermediate in which the N-cysteine (or Ser
or Thr) is attached to the adjacent extein by a thioester or ester, respectively. This
intermediate then undergoes a trans-esterification reaction due to nucleophilic attack by an
O- or S-containing side chain of a Cys, Ser or Thr residue at the C-terminal end of the intein.
This forms a branched polypeptide intermediate in which the N-extein is joined to a side
chain of the Cys, Thr or Ser of the C-extein by a thioester or ester linkage. The intein is then
released by cyclization of a conserved Asn residue at the carboxy end of the intein to form a
succinimide derivative, followed by an O-N or S-N acyl shift and concomitant hydrolysis of
the succinimide. The mechanisms of intein cleavage are discussed in, for example, Chong et
al. (1998) Gene 192: 271-281; Evans et al. (1998) Protein Sci. 7: 2256-2264; and Paulus
(1998) Chem. Soc. Reviews 27: 375-386.
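The net outcome of the splicing steps described above can be sketched at the sequence level in Python. The sketch is only a bookkeeping model under stated assumptions: it checks the residue requirements named in the text (Cys, Ser or Thr at the first intein position, Asn at the last intein position, and Cys, Ser or Thr at the first C-extein position) and then joins the exteins. It does not model any chemistry, and the function name, index convention, and example sequence are hypothetical.

```python
NUCLEOPHILIC = set("CST")  # Cys, Ser, Thr (one-letter codes)


def splice(precursor: str, intein_start: int, intein_end: int):
    """Return (spliced product, excised intein) for a precursor polypeptide.

    Indices are 0-based and `intein_end` is exclusive, so the intein is
    precursor[intein_start:intein_end].
    """
    n_extein = precursor[:intein_start]
    intein = precursor[intein_start:intein_end]
    c_extein = precursor[intein_end:]

    if not intein or intein[0] not in NUCLEOPHILIC:
        raise ValueError("intein must begin with Cys, Ser or Thr")
    if intein[-1] != "N":
        raise ValueError("intein must end with the conserved Asn")
    if not c_extein or c_extein[0] not in NUCLEOPHILIC:
        raise ValueError("C-extein must begin with Cys, Ser or Thr")

    # Net result: the N-extein and C-extein are joined by a peptide bond
    # and the intein is released.
    return n_extein + c_extein, intein


# Hypothetical precursor: N-extein "MAKG", intein "CAAAHN", C-extein "SGHW".
product, excised = splice("MAKGCAAAHNSGHW", 4, 10)
print(product, excised)
```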
Inteins are described in, for example, U.S. Patent Nos. 5,981,182, and
5,834,247, which are herein incorporated by reference in their entirety for all purposes and for
the purpose of teaching inteins and intein chemistry. Inteins generally include amino acid
residues that are conserved among inteins of different proteins. Intein motifs are described
in, for example, Pietrokovski, (1994) Protein Science 3:2340-2350; Perler et al. (1997)
Nuc. Acids Res. 25:1087-93; Pietrokovski, S. (1998) Protein Sci. 7:64-71. Other methods of
identifying inteins are described in, for example, Dalgaard et al. (1997) Computational
Biol. 4:193-214 and Gorbalenya, A. E. (1998) Nucleic Acids Res. 26:1741-8. "INBASE," a
compilation of known inteins by New England Biolabs, is found at
http://circuit.neb.com/inteins/int_id.html.

For use in the methods of the present invention, it is preferred to use mutant
inteins in which only the amino-terminal end of the intein is capable of participating in the
reaction. Such mutant inteins thus do not result in splicing of the N-extein to the C-extein.
Instead, the N-extein is released from the intein upon attack by an activating compound that
contains a nucleophilic group (e.g., a thiol or hydroxyl) under conditions conducive to intein
cleavage. The activating compound then becomes attached to the end of the extein that was
adjacent to the intein by a thioester or ester bond (see, e.g., Muir et al. (1998) Proc. Nat'l
Acad. Sci. USA 95: 6705-6710; Severinov and Muir (1998) J. Biol. Chem. 273: 16205-16209;
Evans et al. (1998) Protein Sci. 7: 2256-2264). Suitable activating compounds that
have nucleophilic groups include, for example, dithiothreitol (DTT), 2-mercaptoethanol,
thiophenol, 2-mercaptoethanesulfonic acid, cysteine-containing molecules, and the like.
In some embodiments, the compounds contain 2-aminonucleophiles such as 2-aminothiols or
2-amino alcohols. These 2-aminonucleophiles can be attached to anchor molecules, such as
are described in more detail below, which are used for attachment of the polypeptide to a
support.

For some applications, the invention uses split inteins, in which the intein is
split among two different polypeptides. The two molecules then undergo trans-splicing to
excise the intein portions (termed the "n-intein" and the "c-intein") and join the two exteins.
For use in the invention, the polypeptide of interest is attached to an Int-n of a split intein
and a molecule to be joined to the polypeptide (e.g., an anchor molecule) is attached to an
Int-c of a split intein. The Int-n and the Int-c undergo the trans-splicing reaction, thus
attaching the anchor molecule to the polypeptide. An example of a naturally occurring split
intein occurs in the DnaB polypeptide of Synechocystis, as described in Wu et al. (1998) Proc.
Nat'l. Acad. Sci. USA 95: 9226-9231 and Gorbalenya (1998) Nucl. Acids Res. 26: 1741-
1748. Other trans-spliced inteins also occur naturally and are likewise suitable for use in the
invention. An intein that, in its natural form, is encoded as a single polypeptide with the
associated exteins can also be split among two expression cassettes and used as a split intein
(see, e.g., Gimble (1998) Chemistry and Biology 5: R251-R256).

The autoprocessing domains of hedgehog proteins are also useful for
obtaining polypeptides that have an ester or thioester at their carboxyl terminus. These
autoprocessing domains are similar to inteins, both in their structure and in their amino acid
sequences. See, Porter et al. (1996) Cell 86: 21-34; Duan et al. (1997) Cell 89: 555-564;
Hall et al. (1997) Cell 91: 85-97.

The use of split inteins in the methods of the present invention is particularly
advantageous for attaching polypeptides that have disulfide bonds. Other attachment
methods, e.g., attachment to sulfide groups and the like, often result in disruption of the
naturally occurring disulfide bonds that occur in the polypeptide. Through use of a split
intein, the joining of the anchor molecule is accomplished by intein-catalyzed splicing.

Generally, fusion proteins in which a polypeptide of interest is attached to a
mutant intein are obtained by recombinant methods. A chimeric nucleic acid is constructed
in which a polynucleotide that codes for the polypeptide of interest is upstream of, and in
frame with, a coding region for an intein. Because intein-mediated cleavage is somewhat
dependent upon the amino acid present at the end of the polypeptide of interest, the chimeric
nucleic acid also can include one or more codons that add one or more amino acids which
facilitate intein-mediated cleavage to the end of the target polypeptide. Examples of suitable
amino acids for cleavage are described in, for example, New England Biolabs catalog
entitled "IMPACT™-CN" (Beverly, MA). The chimeric nucleic acid is then expressed,
resulting in biosynthesis of the fusion protein. The fusion protein is subjected to the
cleavage reactions discussed herein to release the polypeptide of interest having an ester or
thioester attached to the C-terminus. The polypeptide can then be attached to a surface as
described herein.
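The "upstream of, and in frame with" requirement can be illustrated with a minimal Python sketch that concatenates a target coding sequence, optional extra codons, and an intein coding region, and then translates the result to confirm that a single reading frame runs through all three parts. Everything here is an assumption made for illustration: the helper names, the short placeholder sequences (which are not a real target gene or a real intein), and the choice of one glycine codon as the added residue. The sketch uses the standard genetic code only.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(dna: str) -> str:
    """Translate complete codons of `dna`, reading from its first base."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def build_fusion(target_cds: str, intein_cds: str, extra_codons: str = "") -> str:
    """Join target, optional extra codons, and intein into one open reading frame."""
    target = target_cds.upper()
    if target[-3:] in STOP_CODONS:
        target = target[:-3]  # the target's own stop codon would end translation early
    parts = [target, extra_codons.upper(), intein_cds.upper()]
    if any(len(p) % 3 for p in parts):
        raise ValueError("every part must be a whole number of codons")
    fusion = "".join(parts)
    if "*" in translate(fusion)[:-1]:
        raise ValueError("premature stop codon inside the fusion")
    return fusion


# Placeholder sequences, hypothetical and far shorter than any real gene.
fusion = build_fusion("ATGGCTAAAGGTTAA", "TGCGCTCACAACTAA", extra_codons="GGC")
print(translate(fusion))
```

Removing the stop codon of the upstream coding region is the step most easily overlooked; without it, translation would terminate before the intein is reached.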
The construction of suitable chimeric nucleic acids is facilitated by the use of
an expression cassette. An "expression cassette" is a nucleic acid construct, generated
recombinantly or synthetically, that has nucleic acid elements that are capable of effecting
expression of a structural gene in host cells or other systems compatible with such
sequences. Expression cassettes include at least promoters and, optionally, transcription
termination signals. Typically, a recombinant expression cassette includes a nucleic acid to
be transcribed (e.g., a nucleic acid encoding a desired polypeptide), and a promoter.
Additional factors necessary or helpful in effecting expression can also be used. For
example, an expression cassette can also include nucleotide sequences that encode a signal
sequence that directs secretion of an expressed protein from the host cell. Transcription
termination signals, enhancers, and other nucleic acid sequences that influence gene
expression, can also be included in an expression cassette.

In some embodiments, the expression cassette can also include a coding
region for a tag that can noncovalently associate with a binding partner. Such tags are useful
in the purification of the resulting polypeptide by affinity binding prior to immobilization on
the array. Tags can also be used to attach the polypeptides to the surface to form the arrays,
as discussed in more detail below. The tag coding region is typically present downstream of,
and in frame with, the intein coding region. Upon expression, the protein can then be
affinity purified using the tag, after which the intein-mediated cleavage releases the tag from
the polypeptide to be immobilized.

Examples of suitable tags which are proteins include the binding domains of
glutathione-S-transferase (GST), maltose-binding protein, chitinase (e.g., a chitin binding
domain), cellulase (cellulose binding domain), thioredoxin, and the like. If the protein of
interest is an antibody or antibody fragment comprising an Fc region, then the tag may
optionally be protein G, protein A, or recombinant protein A/G (a gene fusion product
secreted from a non-pathogenic form of Bacillus which contains four Fc binding domains
from protein A and two from protein G). Other examples of suitable fusion tags include T7
tag, S tag, His tag, PKA tag, HA tag, c-Myc tag, Trx tag, Hsv tag, Dsb tag, pelB/ompT, KSI,
VSV-G tag, and β-Gal tag. A fusion protein that includes green fluorescent protein (GFP) or
other proteins that can be visualized or can participate in a reaction which forms a detectable
compound can be used for quantification of surface binding.

Examples of tag/tag binder pairs include, but are not limited to, the following:



Fusion tags 


Tag binders 


Histidine (6-8 His)


NTA (Nitrilotriacetic acid, with a metal 
such as Ni, Co, Fe, Cu) 


GST (220 aa) 


GSH (Glutathione, 3 amino acids)


o Uwuuuv ^^■^ "mill*' d^ix^oj 


S (104 aa) 


TTf A TiPTifiHp nmirin aciHr* T*mtein l^itiase 

Inhibitor (PKT) peptide) 


PKA 


HA peptide (9 amino acids) 


HA 


OligoPhenylalanine, or OligoLeucine (10-30
amino acids) 


KSI (125 aa)




OligoGlutamic acid (10-15 amino acids) 


Asp (6-10 Asp) 


OligoArginine (10-15 amino acids) 


MBP (360 aa)


Maltose 


GBD 


Galactose 


CBD (107-156 aa) 


Cellulose 



Methods for constructing and expressing genes that encode fusion proteins
are well known to those of skill in the art. Examples of these techniques and instructions
sufficient to direct persons of skill through many cloning exercises are found in Berger and
Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology 152 Academic
Press, Inc., San Diego, CA (Berger); Sambrook et al. (1989) Molecular Cloning - A
Laboratory Manual (2nd ed.) Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor
Press, NY, (Sambrook et al.); Current Protocols in Molecular Biology, F.M. Ausubel et al.,
eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John
Wiley & Sons, Inc., (2000 Supplement) (Ausubel); Cashion et al., U.S. patent number
5,017,478; and Carr, European Patent No. 0,246,864.

The use of inteins is particularly suitable for constructing arrays of different
protein species, such as those obtained through use of DNA shuffling, recombination, and
other methods known to those of skill in the art for obtaining libraries of nucleic acids that
encode different polypeptide species. The resulting libraries of polypeptide-encoding
polynucleotides are introduced into an expression cassette which includes an insertion site
(preferably one or more restriction enzyme cleavage sites) at which a member of the library
of polynucleotides is introduced into the expression cassette. The insertion site is situated
such that when a polynucleotide is introduced, at least in some fraction of cases, an open
reading frame is formed in which the polypeptide-encoding open reading frame and that of
the intein coding region are in the same frame. A library of cDNA molecules, genomic
DNA fragments, polynucleotides that have been subjected to recombination, and the like, is
ligated into the expression cassette and the resulting fusion protein expressed, subjected to
intein-mediated cleavage to obtain the derivatized polypeptides, and immobilized on the
surface for screening.
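Why only "some fraction" of library members give a usable fusion can be sketched in Python. The check below assumes, purely for illustration, that the cassette reads each insert starting from its first base and that the intein coding region follows immediately after the insert; under those assumptions an insert keeps the intein in frame only if its length is a multiple of three and it contains no in-frame stop codon. The function name and the five example inserts are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def forms_open_reading_frame(insert: str) -> bool:
    """True if a downstream coding region stays in frame and is reached."""
    seq = insert.upper()
    if len(seq) % 3 != 0:
        return False  # frame shift: the intein would be read out of frame
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return not any(codon in STOP_CODONS for codon in codons)


# Hypothetical library members (illustration only).
library = [
    "ATGGCTAAAGGT",  # 12 nt, no stop codon
    "ATGGCTAAAGG",   # 11 nt, shifts the frame
    "ATGTAAGGT",     # in-frame TAA stop
    "ATGGCTA",       # 7 nt, shifts the frame
    "ATGAAAGGTGCT",  # 12 nt, no stop codon
]
usable = [s for s in library if forms_open_reading_frame(s)]
print(len(usable), "of", len(library), "inserts keep the intein in frame")
```

Members that fail this check would not be expected to yield an intein fusion, so they would simply not contribute derivatized polypeptide at the cleavage step.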

Chimeric nucleic acids that encode the polypeptide-intein fusion proteins can
be expressed using either in vivo or in vitro expression systems. Many suitable expression
vectors for expression of polypeptides such as the intein-containing fusion proteins are
commercially available (from Qiagen, Novagen, Clontech, and many other companies).
Suitable expression vectors and systems specifically designed for expression of intein-containing
fusion proteins are commercially available from, for example, New England
Biolabs (Beverly, MA). For in vivo expression, the vectors are introduced into cells of an
appropriate organism which recognizes the expression control signals present in the
expression cassette. Expression in vivo can be done in bacteria (for example, Escherichia
coli, Bacillus sp., and the like), plants (for example, Nicotiana tabacum), lower eukaryotes
(for example, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pichia pastoris, and
filamentous fungi), or higher eukaryotes (for example, baculovirus-infected insect cells,
insect cells, mammalian cells). The choice of organism for optimal expression can depend
on the extent of post-translational modifications (e.g., glycosylation, lipid modifications)
desired. One of ordinary skill in the art will be able to readily choose which host cell type is
most suitable for the protein to be immobilized and application desired.

In other embodiments, in vitro expression systems are used. Systems have
long been available for translation of mRNA molecules. Both eukaryotic and prokaryotic
cell-free systems are available. Eukaryotic systems include, for example, the rabbit
reticulocyte system (Pelham and Jackson (1976) Eur. J. Biochem. 67: 247-256) and the
wheat germ lysate (Roberts and Paterson (1973) Proc. Nat'l. Acad. Sci. USA 70: 2330-
2334). Prokaryotic systems include the E. coli S30 extract method and the fractionated
method described by Gold and Schweiger (1971) Meth. Enzymol. 20: 537.

Coupled transcription and translation in vitro expression systems are
particularly suitable for use in the present invention (see, e.g., US Patent No. 5,324,637;
Kigawa and Yokoyama (1991) J. Biochem. 110:166-168; Kudlicki et al. (1992) Anal.
Biochem. 206:389-393; and Pratt, J., "Coupled transcription-translation in prokaryotic cell-free
systems" in Transcription & Translation: A Practical Approach, Hames & Higgins,
IRL Press, Chapter 7, pp. 179-209 (1987)). Suitable systems include, for example,
Escherichia coli S30 lysates (see, e.g., Zubay (1973) Ann. Rev. Genet. 7: 267), such as, for
example, those from strains that express the chimeric nucleic acid under the control of a T7
RNA polymerase promoter. Preferably, the strains are protease-deficient. Other
systems include wheat germ lysates and reticulocyte lysates (see, e.g., Promega, Pharmacia,
Panvera).

In a presently preferred embodiment, the in vitro expression is conducted
directly on a surface to which the polypeptide is to be immobilized. This can be
accomplished, for example, using a nanodroplet technique that has been described for
making a miniaturized array of cell-based assays (You et al. (1997) Chem. Biol. 4: 969-975).
The methods of the invention can be performed by applying small droplets of a cell-free
expression system to a surface. A micro tip can be used for the application of the droplets.
If desired, the surface can be pre-coated with PDMS, polyethylene glycol, or other reagents
known to reduce non-specific binding to a surface.

Avoidance of evaporation during the expression is of particular importance in
the in vitro expression methods. To reduce evaporation, one can use microchannels to apply
the cell-free expression systems. Suitable microchannel dispensers, and surfaces for use with
such dispensers, are described below and in US Patent Application 09/792335, filed
February 23, 2001. The cell-free systems can be pumped through microchannels to load a
channel above the surface to which is attached the array of polypeptides. One can load
different chambers with cell-free expression samples that contain different templates.

The invention also provides arrays in which a plurality of polypeptide species
are attached to a surface, along with polynucleotides that encode each of the polypeptide
species. Such arrays allow one to not only identify a polypeptide of interest by screening the
array, but also identify the particular polynucleotide that encodes the polypeptide of interest.
Thus, one can readily use the polynucleotide to determine the deduced amino acid sequence
of the polypeptide, and to express the polypeptide in quantity.

The combined arrays can be made by conducting the in vitro expression
directly on a surface to which the polypeptide is to be immobilized, as described above,
while also attaching the polynucleotide to the surface. Methods for attaching
polynucleotides to a surface are known to those of skill in the art.

3. Pre-screening of polypeptides prior to attachment to surface

It is sometimes desirable to conduct an initial screening of a polypeptide
library to identify those that have a particular activity prior to immobilizing the polypeptide
species in an array on a surface. Phage display and related methods are particularly
amenable to such initial screening methods. A basic concept of display methods that use
phage or other replicable genetic package is the establishment of a physical association
between DNA encoding a polypeptide to be screened and the polypeptide. This physical
association is provided by the replicable genetic package, which displays a polypeptide as
part of a capsid enclosing the genome of the phage or other package, wherein the
polypeptide is encoded by the genome. The establishment of a physical association between
polypeptides and their genetic material allows simultaneous mass screening of very large
numbers of phage bearing different polypeptides. Phage displaying a polypeptide with a
desired activity, such as affinity to a target, e.g., a receptor, bind to the target and these
phage are enriched by affinity screening to the target. The identity of polypeptides displayed
from these phage can be determined from the respective phage genomes. Using these
methods, a polypeptide identified as having a binding affinity for a desired target can then be
synthesized in bulk by conventional means.

Typically, the initial screening using such methods involves expressing the
recombinant peptides or polypeptides encoded by the recombinant polynucleotides of a
library as fusions with a protein that is displayed on the surface of a replicable genetic
package. For example, phage display can be used. See, e.g., Cwirla et al., Proc. Nat'l Acad.
Sci. USA 87: 6378-6382 (1990); Devlin et al., Science 249: 404-406 (1990); Scott & Smith,
Science 249: 386-388 (1990); Ladner et al., US 5,571,698. Other replicable genetic
packages include, for example, bacteria, eukaryotic viruses, yeast, and spores.

The genetic packages most frequently used for display libraries are
bacteriophage, particularly filamentous phage, and especially phage M13, Fd and F1. Most
work has involved inserting libraries encoding polypeptides to be displayed into either gIII
or gVIII of these phage, forming a fusion protein. See, e.g., Dower, WO 91/19818; Devlin,
WO 91/18989; McCafferty, WO 92/01047 (gene III); Huse, WO 92/06204; Kang, WO
92/18619 (gene VIII). Such a fusion protein comprises a signal sequence, usually but not
necessarily from the phage coat protein, a polypeptide to be displayed, and either the gene III
or gene VIII protein or a fragment thereof. Exogenous coding sequences are often inserted
at or near the N-terminus of gene III or gene VIII, although other insertion sites are possible.

Eukaryotic viruses can be used to display polypeptides in an analogous
manner. For example, display of human heregulin fused to gp70 of Moloney murine
leukemia virus has been reported by Han et al., Proc. Nat'l. Acad. Sci. USA 92: 9747-9751
(1995). Spores can also be used as replicable genetic packages. In this case, polypeptides
are displayed from the outer surface of the spore. For example, spores from B. subtilis have
been reported to be suitable. Sequences of coat proteins of these spores are described in
Donovan et al., J. Mol. Biol. 196: 1-10 (1987). Cells can also be used as replicable genetic
packages. Polypeptides to be displayed are inserted into a gene encoding a cell protein that
is expressed on the cell's surface. Bacterial cells including Salmonella typhimurium, Bacillus
subtilis, Pseudomonas aeruginosa, Vibrio cholerae, Klebsiella pneumoniae, Neisseria
gonorrhoeae, Neisseria meningitidis, Bacteroides nodosus, Moraxella bovis, and especially
Escherichia coli are preferred. Details of outer surface proteins are discussed by Ladner et
al., US Patent No. 5,571,698 and references cited therein. For example, the lamB protein of
E. coli is suitable.

Once the prescreening has identified polypeptides that are of interest for
further screening, the polypeptides can be derivatized with a C-terminal ester or thioester and
immobilized on a surface according to the methods of the invention. The polypeptides of
interest can be released from the surface protein by methods known to those of skill in the
art, such as proteolytic cleavage and the like. Chemical methods can then be used to
accomplish the desired derivatization.

A more preferable way to obtain release of the polypeptide of interest while
simultaneously accomplishing the introduction of a terminal ester or thioester is provided by
the invention. An intein coding region is introduced between the polynucleotide of interest
and the coding region for the surface-displayed protein. The resulting fusion protein, when
expressed, then includes the polypeptide of interest (e.g., a library member, and the like), the
intein, and the phage surface-displayed protein. After expression, the initial screening is
conducted using the polypeptide displayed on the phage or other replicable genetic package.
After identifying those phage that display a polypeptide that has the desired activity, the
polypeptide is released from the phage simply by carrying out the intein cleavage reactions
described herein. No proteolytic cleavage or other undesirable method is required.
Moreover, the protein then has the desired ester or thioester bond which can serve as an
attachment point.

The invention provides expression cassettes and expression vectors that
facilitate the use of display on replicable genetic packages for initial screening, followed by
intein-mediated derivatization of the polypeptide. The expression cassettes include an
insertion site at which a member of the library of nucleic acids is introduced into the
expression cassette. The insertion site preferably includes one or more restriction enzyme
cleavage sites. Downstream of the insertion site is an intein coding region, which in turn is
followed by an open reading frame that encodes a polypeptide that is displayed on a surface
of a replicable genetic package. The introduction of a coding region for a polypeptide of
interest, such as a member of the library of nucleic acids, at the insertion site results in an
open reading frame that encodes a fusion protein that comprises the polypeptide encoded by
the library member, the intein, and the surface-displayed polypeptide.
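
The order of coding regions in such a cassette can be sketched in a few lines of Python. This is only an illustrative model and not part of the disclosed methods; the names `Segment`, `build_fusion_orf`, and `released_on_intein_cleavage`, and the placeholder strings, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One coding region in the cassette (hypothetical model)."""
    name: str
    role: str  # "insert", "intein", or "display"


def build_fusion_orf(library_member: str, intein: str, display_protein: str) -> List[Segment]:
    """Return the coding regions in reading order: insert, intein, display protein."""
    return [
        Segment(library_member, "insert"),
        Segment(intein, "intein"),
        Segment(display_protein, "display"),
    ]


def released_on_intein_cleavage(orf: List[Segment]) -> List[Segment]:
    """Return the segments upstream of the intein, i.e. the polypeptide of interest."""
    intein_index = next(i for i, seg in enumerate(orf) if seg.role == "intein")
    return orf[:intein_index]


# Placeholder names only; no real sequences are implied.
orf = build_fusion_orf("library_member", "intein", "display_protein")
print([seg.role for seg in orf])
print([seg.name for seg in released_on_intein_cleavage(orf)])
```

The model captures only the ordering stated above: the inserted library member lies upstream of the intein, so intein cleavage separates it from the surface-displayed polypeptide.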

The fusion protein is then expressed in the appropriate system, which results
in the polypeptide of interest being displayed on the surface of the corresponding replicable
genetic package. After initial screening using methods known to those of skill in the art, the
fusion proteins that are of interest for further evaluation and/or use are subjected to
intein-mediated cleavage and ester/thioester derivatization, followed by attachment to a surface.

The target protein/intein/surface display peptide fusion proteins are useful not
only for preselecting polypeptides for subsequent immobilization, but are also useful for
modifying a protein by adding the phage display-selected polypeptide to an end of a protein
of interest. After selection of individual phage that display polypeptides having the desired
biological activity (e.g., binding activity), the polypeptides can be subjected to
intein-mediated cleavage to release the binding polypeptides and simultaneously introduce a
reactive ester or thioester group. The binding polypeptides can then be attached to a protein
of interest.

B. Anchor Molecules and Attachment to Surface

The ester- or thioester-containing polypeptides are attached to a surface by
reacting the ester or thioester groups with an anchor molecule comprising a reactive group
(e.g., a functional group) that reacts with the ester or thioester group to attach the
polypeptide to the anchor molecule. The anchor molecule can be attached to the surface
before, after, or during reaction with the ester or thioester.

In certain embodiments, the reactive group on the anchor molecule is a group
that has a nucleophilic group at the 2 or 3 position relative to a second nucleophilic group.
One of the nucleophilic groups is, in some embodiments, α to a carbonyl group. One
nucleophilic group on the compound attacks the ester or thioester on the polypeptide to form
an intermediate, which then undergoes an intramolecular rearrangement involving the
second nucleophile on the compound. The intermediate typically involves a 5- or 6-membered
ring structure. The first reaction involves the group that has the greatest
nucleophilic character, while the second nucleophilic group generally forms a more
thermodynamically and/or kinetically stable product than the first. For example, a
2-aminonucleophile or 3-aminonucleophile compound (e.g., 2-aminothiol or 3-aminothiol) can
undergo a trans-esterification reaction with the ester or thioester on the polypeptide. This
reaction produces an intermediate in which the polypeptide is linked to the compound by a
2-aminonucleophile-ester bond. The resulting 2-aminonucleophile-ester bond then
undergoes an intramolecular rearrangement mediated by the second nucleophilic group on
the compound to form an amide bond that stably links the anchor molecule to the
polypeptide. For illustrative purposes, examples of suitable compounds that have two
nucleophilic groups include structures such as:

The above structures can also have additional substitutions at one or more of the carbons,
and can have an additional carbon between the amine and the thiol. Examples of suitable
nucleophilic groups include those known to those of skill in the art, including S, N, and
Se, for example. The dashed lines represent a moiety that is, or can be, attached to a surface.

In other embodiments, the reactive group on the anchor molecule is a
nucleophilic group that can directly react with the thioester or ester. Examples of such
reactive groups include, without limitation, hydrazine groups (e.g., NH2NH-R, where R is
the anchor molecule), hydroxylamine groups, and aminooxy groups.

The anchor molecules having two nucleophilic reactive groups or containing
reactive groups such as a hydrazine, a hydroxylamine, or an aminooxy group can be
either directly attachable to a surface, or can be attached to a surface by another compound
with which the di-nucleophilic compound can react. For example, the di-nucleophilic
compound can be covalently linked to the surface-attached compound, or can be
noncovalently associated with the surface-attached compound. For example, the
di-nucleophilic compound can include a functional group that can form a covalent bond with a
molecule attached to a surface. Preferably, the functional group is one that can participate in
a chemoselective ligation reaction having little or no cross reactivity with functional groups
present in the amino acids that make up the polypeptide being attached. Alternatively, the
reactive functional groups can exert some cross reactivity if the groups are activated in
proximity to the desired target under conditions wherein bond formation with the target is
favored over reactivity with other sites. Examples of such reactive groups (or covalent
linking groups) include ketones (which can react with an acyl hydrazine on a surface to form
an acyl hydrazone), olefins (which can react with a second olefin on a surface or as part of a
label in a cross olefin metathesis catalyzed by, for example, a ruthenium complex), or a
diketone (which can react with a guanidine group). Of course, one can reverse which
member of the reactive pairs is attached to the surface, and attach an acyl hydrazine, for
example, to the di-nucleophilic compound and the ketone to the surface. Other covalent
linking groups useful in the present invention include epoxides, aldehydes, reactive esters
(e.g., pentafluorophenyl esters, nitrophenyl esters), isocyanates and thioisocyanates,
carboxylic acid chlorides, disulfides and sulfonate esters (e.g., mesylates, tosylates and the
like). Still other covalent linking groups are the sulfhydryl groups (preferably protected until
reaction is desired). Other suitable covalent linking groups include, but are not limited to,
maleimide, isomaleimide, N-hydroxysuccinimide (Wagner et al. (1996) Biophysical Journal
70: 2052-2066), nitrilotriacetic acid (US Patent No. 5,620,850), activated hydroxyl,
haloacetyl, activated carboxyl, hydrazide, epoxy, aziridine, sulfonylchloride,
trifluoromethyldiaziridine, pyridyldisulfide, N-acyl-imidazole, imidazolecarbamate,
vinylsulfone, succinimidylcarbonate, arylazide, anhydride, diazoacetate, benzophenone,
isothiocyanate, isocyanate, imidoester, fluorobenzene, and the like.

The functional group will in some embodiments be protected, or otherwise
rendered inactive to covalent bond formation, by a protecting group. A variety of protecting
groups are useful in the invention and can be selected based on the functionality present in
the functional group. The term "protecting group" as used herein refers to any of the groups
which are designed to block one reactive site in a molecule while a chemical reaction is
carried out at another reactive site. More particularly, the protecting groups used herein can
be any of those groups described in Greene et al., Protective Groups In Organic Chemistry,
2nd Ed., John Wiley & Sons, New York, N.Y., 1991. The proper selection of protecting
groups for a particular synthesis will be governed by the overall methods employed in the
synthesis. For example, in automated synthesis photolabile protecting groups such as
NVOC, MeNPOC, and the like can be used. In other embodiments, protecting groups may be
used that are removable by chemical methods, such as FMOC, DMT and others
known to those of skill in the art.

In some embodiments, the di-nucleophilic compound is a peptide that has at
its amino terminus a Cys, Ser, or Thr residue which can undergo the trans-esterification
reaction with the polypeptide to be immobilized. The peptide can have attached, generally at
its carboxyl terminus, a functional group such as those described above which can form a
covalent linkage with a molecule that is attached to a surface. Alternatively, the peptide can
include a tag which can non-covalently associate with a molecule that is attached to a
surface. Suitable tags and respective binding partners are known to those of skill in the art,
and several examples are described above.

The polypeptides to be immobilized can be attached to the di-nucleophilic
compounds prior to, simultaneously with, or after the di-nucleophilic compounds are
attached to the surface.

Methods of attaching molecules to different surfaces are known to those of
skill in the art. In some embodiments, an organic thinfilm is employed to form a layer
either on the substrate itself or on a coating covering the substrate, upon which each of the
patches of polypeptides is immobilized. Organic thinfilms are described in copending US
Patent Appl. No. 09/820210, filed March 27, 2001. A variety of different organic thinfilms
are suitable for use in the present invention. Methods for the formation of organic thinfilms
include in situ growth from the surface, deposition by physisorption, spin-coating,
chemisorption, self-assembly, or plasma-initiated polymerization from gas phase. For
instance, a hydrogel composed of a material such as dextran can serve as a suitable organic
thinfilm on the patches of the array. In one preferred embodiment of the invention, the
organic thinfilm is a lipid bilayer. In another preferred embodiment, the organic thinfilm of
each of the patches of the array is a monolayer. A monolayer of polyarginine or polylysine
adsorbed on a negatively charged substrate or coating is one option for the organic thinfilm.
Another option is a disordered monolayer of tethered polymer chains. In a particularly
preferred embodiment, the organic thinfilm is a self-assembled monolayer. A monolayer of
polylysine is one option for the organic thinfilm. The organic thinfilm can be, for example, a
self-assembled monolayer which comprises molecules of the formula X-R-Y, wherein R is a
spacer, X is a functional group that binds R to the surface, and Y is a molecule that attaches
to the polypeptide, or a moiety attached to the polypeptide. For example, Y can be the
di-nucleophilic compound which is used to attach the polypeptides onto the monolayer, or Y
can be a binding partner for a tag that is attached to the polypeptide.

In an alternative embodiment, the self-assembled monolayer is comprised of
molecules of the formula (X)aR(Y)b where a and b are, independently, integers greater than
or equal to 1 and X, R, and Y are as previously defined. In another alternative embodiment,
the organic thinfilm comprises a combination of organic thinfilms such as a combination of a
lipid bilayer immobilized on top of a self-assembled monolayer of molecules of the formula
X-R-Y. As another example, a monolayer of polylysine can also optionally be combined
with a self-assembled monolayer of molecules of the formula X-R-Y (see US Patent No.
5,629,213).

In all cases, the coating, or the substrate itself if no coating is present, must be
compatible with the chemical or physical adsorption of the organic thinfilm on its surface.
For instance, if the patches comprise a coating between the substrate and a monolayer of
molecules of the formula X-R-Y, then it is understood that the coating must be composed of
a material for which a suitable functional group X is available. If no such coating is present,
then it is understood that the substrate must be composed of a material for which a suitable
functional group X is available.

The methods of the invention can also be used with trifunctional linkers such
as are described in copending US Patent Appl. No. 09/820210, filed March 27, 2001. These
linkers are useful for the site-specific introduction of a label to a polypeptide, in addition to
the site-specific immobilization of a polypeptide to a solid support. These trifunctional
crosslinking groups have, in some embodiments, the formula:

[Formula: a trivalent core W carrying functional groups X, Y and Z through linking groups L¹, L² and L³]

wherein W is a trivalent core component; L¹, L² and L³ are independently
linking groups; X is a non-covalent polypeptide tag binder; Y is a photoactivatable covalent
linking group; and Z is a protected or unprotected covalent crosslinking group. In this
particular example, a trifunctional linking group is depicted having three functional groups
(X, Y and Z) attached via linkers (L¹, L² and L³) to a central core (W). The first functional
group is one which provides a non-covalent association with a targeted polypeptide or a
polypeptide of interest. For example, the trifunctional linking group can form a non-covalent
association complex with a polypeptide having a suitable tag (e.g., a his-tag). The
second functional group can then establish a covalent linkage to the polypeptide at a site
which is proximate to the initial non-covalent association site. One of skill in the art will
appreciate that although the polypeptide is shown as a relatively small circle (relative to the
size of the trifunctional crosslinking group), in fact the polypeptide in most embodiments is
quite large relative to the crosslinking group. Nevertheless, the site for covalent attachment
of functional group Y will depend on the lengths and flexibility of the linking groups L¹ and
L². Typically, the site for covalent attachment of Y to the polypeptide will be within about
50 Å of the site of non-covalent association. Release of the non-covalent functional group
(X) from the polypeptide provides a polypeptide having a covalently bound trifunctional
crosslinking group. In subsequent steps, functional group Z of the polypeptide-crosslinking
group composition can be used, for example, to attach a suitable label to the polypeptide, or
to immobilize the polypeptide on a suitable support.

C. Polypeptide Arrays

The present invention provides arrays of polypeptides, as well as methods for
synthesizing such arrays. Typically, the polypeptide arrays comprise micrometer-scale,
two-dimensional patterns of patches of polypeptides immobilized on a surface of the substrate.
Polypeptide arrays and their use for high-throughput screening are described in, for example,
co-pending US patent application Ser. Nos. 09/115,455, filed July 14, 1998; 09/353,215,
filed July 14, 1999 and 09/353,555, filed July 14, 1999; and related PCT published
applications WO 00/04382, WO 00/04389 and WO 00/04390.

In one embodiment, the present invention provides an array of polypeptides
which comprises a substrate, at least one organic thinfilm on some or all of the substrate
surface, and a plurality of patches arranged in discrete, known regions on portions of the
substrate surface covered by organic thinfilm, wherein each of said patches comprises a
polypeptide immobilized on the underlying organic thinfilm.

In most cases, the array will comprise at least about ten patches. In a
preferred embodiment, the array comprises at least about 50 patches. In a particularly
preferred embodiment the array comprises at least about 100 patches. In alternative
preferred embodiments, the array of polypeptides can comprise more than 10³, 10⁴ or 10⁵
patches.

The area of surface of the substrate covered by each of the patches is
preferably no more than about 0.25 mm². Preferably, the area of the substrate surface
covered by each of the patches is between about 1 µm² and about 10,000 µm². In a
particularly preferred embodiment, each patch covers an area of the substrate surface from
about 100 µm² to about 2,500 µm². In an alternative embodiment, a patch on the array can
cover an area of the substrate surface as small as about 2,500 nm², although patches of such
small size are generally not necessary for the use of the array.

The patches of the array can be of any geometric shape. For instance, the
patches can be rectangular or circular. The patches of the array can also be irregularly
shaped.

The distance separating the patches of the array can vary. Preferably, the
patches of the array are separated from neighboring patches by about 1 µm to about 500 µm.
Typically, the distance separating the patches is roughly proportional to the diameter or side
length of the patches on the array if the patches have dimensions greater than about 10 µm.
If the patch size is smaller, then the distance separating the patches will typically be larger
than the dimensions of the patch.

In a preferred embodiment of the array, the patches of the array are all
contained within an area of about 1 cm² or less on the surface of the substrate. In one
preferred embodiment of the array, therefore, the array comprises 100 or more patches
within a total area of about 1 cm² or less on the surface of the substrate. Alternatively, a
particularly preferred array comprises 10³ or more patches within a total area of about 1 cm²
or less. A preferred array can even optionally comprise 10⁴ or 10⁵ or more patches within an
area of about 1 cm² or less on the surface of the substrate. In other embodiments of the
invention, all of the patches of the array are contained within an area of about 1 mm² or less on
the surface of the substrate.
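
As a rough check on how patch size and spacing relate to the number of patches in a given area, the arithmetic can be sketched as follows. The sketch assumes square patches laid out on a regular square grid with a uniform gap, which is a simplification not stated in the text; the function name `patches_per_region` is hypothetical.

```python
import math

UM_PER_CM = 10_000.0  # 1 cm expressed in micrometers


def patches_per_region(side_um: float, gap_um: float, region_um: float = UM_PER_CM) -> int:
    """Count square patches on a regular grid inside a square region.

    n patches in a row occupy n*side + (n-1)*gap, so the largest n that
    fits is floor((region + gap) / (side + gap)).
    """
    if side_um <= 0 or gap_um < 0:
        raise ValueError("side must be positive and gap non-negative")
    per_row = math.floor((region_um + gap_um) / (side_um + gap_um))
    return max(per_row, 0) ** 2


# Assumed example layouts (side and gap in micrometers) within 1 cm x 1 cm.
for side, gap in [(500, 500), (50, 50), (10, 10)]:
    print(side, gap, patches_per_region(side, gap))
```

Under these assumed layouts, 500 µm patches (0.25 mm² each) separated by 500 µm give 100 patches in 1 cm², 50 µm patches separated by 50 µm give 10,000, and 10 µm patches separated by 10 µm give 250,000.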

Typically, only one type of polypeptide is immobilized on each patch of the
array. In a preferred embodiment of the array, the polypeptide immobilized on one patch
differs from the polypeptide immobilized on a second patch of the same array. In such an
embodiment, a plurality of different polypeptides are present on separate patches of the
array. Typically the array comprises at least about ten different polypeptides. Preferably,
the array comprises at least about 50 different polypeptides. More preferably, the array
comprises at least about 100 different polypeptides. Alternative preferred arrays comprise
more than about 10³ different polypeptides or more than about 10⁴ different polypeptides.
The array can even optionally comprise more than about 10⁵ different polypeptides.

In one embodiment of the array, each of the patches of the array comprises a
different polypeptide. For instance, an array comprising about 100 patches could comprise
about 100 different polypeptides. Likewise, an array of about 10,000 patches could
comprise about 10,000 different polypeptides. In an alternative embodiment, however, each
different polypeptide is immobilized on more than one separate patch on the array. For
instance, each different polypeptide can optionally be present on two to six different patches.
An array of the invention, therefore, can comprise about three thousand polypeptide
patches, but only comprise about one thousand different polypeptides since each different
polypeptide is present on three different patches.
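
The replicate arithmetic in this example can be written out as a short sketch. The function names `distinct_polypeptides` and `assign_patches` are hypothetical, and placing replicates on consecutive patch indices is an arbitrary choice made here for illustration; the text does not specify where replicate patches sit on the array.

```python
from typing import Dict, List


def distinct_polypeptides(total_patches: int, replicates: int) -> int:
    """Number of different polypeptides when each occupies `replicates` patches."""
    if replicates < 1:
        raise ValueError("replicates must be at least 1")
    if total_patches % replicates != 0:
        raise ValueError("total_patches must be a multiple of replicates")
    return total_patches // replicates


def assign_patches(polypeptides: List[str], replicates: int) -> Dict[str, List[int]]:
    """Map each polypeptide to `replicates` consecutive patch indices (assumed layout)."""
    layout: Dict[str, List[int]] = {}
    next_patch = 0
    for name in polypeptides:
        layout[name] = list(range(next_patch, next_patch + replicates))
        next_patch += replicates
    return layout


print(distinct_polypeptides(3000, 3))
print(assign_patches(["polypeptide_A", "polypeptide_B"], 3))
```

With three patches per polypeptide, 3,000 patches correspond to 1,000 different polypeptides, matching the example in the text.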

In another embodiment of the present invention, although the polypeptide of
one patch is different from that of another, the polypeptides are related. In a preferred
embodiment, the two different polypeptides are members of the same polypeptide family.
The different polypeptides on the invention array can be either functionally related or just
suspected of being functionally related. In another embodiment of the invention array,
however, the function of the immobilized polypeptides can be unknown. In this case, the
different polypeptides on the different patches of the array share a similarity in structure or
sequence or are simply suspected of sharing a similarity in structure or sequence.
Alternatively, the immobilized polypeptides can be just fragments of different members of a
polypeptide family.

The polypeptides immobilized on the array of the invention can be members
of a polypeptide family such as a receptor family (examples: growth factor receptors,
catecholamine receptors, amino acid derivative receptors, cytokine receptors, lectins), ligand
family (examples: cytokines, serpins), enzyme family (examples: proteases, kinases,
phosphatases, ras-like GTPases, hydrolases), and transcription factors (examples: steroid
hormone receptors, heat-shock transcription factors, zinc-finger proteins, leucine-zipper
proteins, homeodomain proteins). In one embodiment, the different immobilized
polypeptides are all HIV proteases or hepatitis C virus (HCV) proteases. In other
embodiments of the invention, the immobilized polypeptides on the patches of the array are
all hormone receptors, neurotransmitter receptors, extracellular matrix receptors, antibodies,
DNA-binding proteins, intracellular signal transduction modulators and effectors,
apoptosis-related factors, DNA synthesis factors, DNA repair factors, DNA recombination factors, or
cell-surface antigens.

In some embodiments, the polypeptide immobilized on each patch is an
antibody or antibody fragment. The antibodies or antibody fragments of the array can
optionally be single-chain Fvs, Fab fragments, Fab' fragments, F(ab')2 fragments, Fv
fragments, dsFvs, diabodies, Fc fragments, full-length, antigen-specific polyclonal
antibodies, or full-length monoclonal antibodies. In a preferred embodiment, the
immobilized polypeptides on the patches of the array are monoclonal antibodies, Fab
fragments or single-chain Fvs.

In another preferred embodiment of the invention, the polypeptides
immobilized to each patch of the array are polypeptide-capture agents.

In an alternative embodiment of the invention array, the polypeptides on
different patches are identical.

Biosensors, micromachined devices, and diagnostic devices that comprise the
polypeptide arrays of the invention are also contemplated by the present invention.

The physical structure of the polypeptide arrays will typically comprise a
substrate and, optionally, a coating or organic thinfilm or both.

The substrate of the array can be either organic or inorganic, biological or
non-biological, or any combination of these materials. In one embodiment, the substrate is
transparent or translucent. The portion of the surface of the substrate on which the patches
reside is preferably flat and firm or semi-firm. However, the array of the present invention
need not necessarily be flat or entirely two-dimensional. Significant topological features
can be present on the surface of the substrate surrounding the patches, between the patches
or beneath the patches. For instance, walls or other barriers can separate the patches of the
array.

Numerous materials are suitable for use as a substrate in the array
embodiment of the invention. For instance, the substrate of the invention array can comprise
a material selected from a group consisting of silicon, silica, quartz, glass, controlled pore
glass, carbon, alumina, titania, tantalum oxide, germanium, silicon nitride, zeolites, and
gallium arsenide. Many metals such as gold, platinum, aluminum, copper, titanium, and
their alloys are also options for substrates of the array. In addition, many ceramics and
polymers can also be used as substrates. Polymers which can be used as substrates include,
but are not limited to, the following: polystyrene; poly(tetra)fluoroethylene (PTFE);
polyvinylidenedifluoride; polycarbonate; polymethylmethacrylate; polyvinylethylene;
polyethyleneimine; poly(etherether)ketone; polyoxymethylene (POM); polyvinylphenol;
polylactides; polymethacrylimide (PMI); polyalkenesulfone (PAS); polypropylene;
polyethylene; polyhydroxyethylmethacrylate (HEMA); polydimethylsiloxane;
polyacrylamide; polyimide; and block-copolymers. Preferred substrates for the array include
silicon, silica, glass, and polymers. The substrate on which the patches reside can also be a
combination of any of the aforementioned substrate materials.

An array of the present invention can optionally further comprise a coating
between the substrate and organic thinfilm on the array. This coating can either be formed
on the substrate or applied to the substrate. The substrate can be modified with a coating by
using thin-film technology based, for example, on physical vapor deposition (PVD), thermal
processing, or plasma-enhanced chemical vapor deposition (PECVD). Alternatively, plasma
exposure can be used to directly activate or alter the substrate and create a coating. For
instance, plasma etch procedures can be used to oxidize a polymeric surface (i.e.,
polystyrene or polyethylene to expose polar functionalities such as hydroxyls, carboxylic
acids, aldehydes and the like).

The coating is optionally a metal film. Possible metal films include
aluminum, chromium, titanium, tantalum, nickel, stainless steel, zinc, lead, iron, copper,
magnesium, manganese, cadmium, tungsten, cobalt, and alloys or oxides thereof. In a
preferred embodiment, the metal film is a noble metal film. Noble metals that can be used
for a coating include, but are not limited to, gold, platinum, silver, and copper. In an
especially preferred embodiment, the coating comprises gold or a gold alloy. Electron-beam
evaporation can be used to provide a thin coating of gold on the surface of the substrate. In a
preferred embodiment, the metal film is from about 50 nm to about 500 nm in thickness. In
an alternative embodiment, the metal film is from about 1 nm to about 1 µm in thickness.

In alternative embodiments, the coating comprises a composition selected
from the group consisting of silicon, silicon oxide, titania, tantalum oxide, silicon nitride,
silicon hydride, indium tin oxide, magnesium oxide, alumina, glass, hydroxylated surfaces,
and polymers.

WO 01/98458 PCT/US01/19531

In one embodiment of the invention array, the surface of the coating is
atomically flat. In this embodiment, the mean roughness of the surface of the coating is less
than about 5 angstroms for areas of at least 25 µm². In a preferred embodiment, the mean
roughness of the surface of the coating is less than about 3 angstroms for areas of at least 25
µm². The ultraflat coating can optionally be a template-stripped surface as described in
Hegner et al., Surface Science, 1993, 291:39-46 and Wagner et al., Langmuir, 1995,
11:3867-3875, both of which are incorporated herein by reference.

It is contemplated that the coatings of many arrays will require the addition of
at least one adhesion layer between said coating and the substrate. Typically, the adhesion
layer will be at least 6 angstroms thick and can be much thicker. For instance, a layer of
titanium or chromium can be desirable between a silicon wafer and a gold coating. In an
alternative embodiment, an epoxy glue such as Epo-tek 377®, Epo-tek 301-2® (Epoxy
Technology Inc., Billerica, Massachusetts) can be preferred to aid adherence of the coating
to the substrate. Determinations as to what material should be used for the adhesion layer
would be obvious to one skilled in the art once materials are chosen for both the substrate
and coating. In other embodiments, additional adhesion mediators or interlayers can be
necessary to improve the optical properties of the array, for instance, in waveguides for
detection purposes.

Deposition or formation of the coating (if present) on the substrate is
performed prior to the formation of the organic thinfilm thereon. Several different types of
coating can be combined on the surface. The coating can cover the whole surface of the
substrate or only parts of it. The pattern of the coating may or may not be identical to the
pattern of organic thinfilms used to immobilize the polypeptides. In one embodiment of the
invention, the coating covers the substrate surface only at the site of the patches of the
immobilized polypeptides. Techniques useful for the formation of coated patches on the
surface of the substrate which are organic thinfilm compatible are well known to those of
ordinary skill in the art. For instance, the patches of coatings on the substrate can optionally
be fabricated by photolithography, micromolding (PCT Publication WO 96/29629), wet
chemical or dry etching, or any combination of these.

The organic thinfilm on which each of the patches of polypeptides is
immobilized forms a layer either on the substrate itself or on a coating covering the
substrate. The organic thinfilm on which the polypeptides of the patches are immobilized is
preferably less than about 20 nm thick. In some embodiments of the invention, the organic
thinfilm of each of the patches can be less than about 10 nm thick.

A variety of different organic thinfilms are suitable for use in the present
invention. Methods for the formation of organic thinfilms include in situ growth from the
surface, deposition by physisorption, spin-coating, chemisorption, self-assembly, or
plasma-initiated polymerization from gas phase. For instance, a hydrogel composed of a material
such as dextran can serve as a suitable organic thinfilm on the patches of the array. In one
preferred embodiment of the invention, the organic thinfilm is a lipid bilayer. In another
preferred embodiment, the organic thinfilm of each of the patches of the array is a
monolayer. A monolayer of polyarginine or polylysine adsorbed on a negatively charged
substrate or coating is one option for the organic thinfilm. Another option is a disordered
monolayer of tethered polymer chains. In a particularly preferred embodiment, the organic
thinfilm is a self-assembled monolayer. A monolayer of polylysine is one option for the
organic thinfilm.

In all cases, the coating, or the substrate itself if no coating is present, must be 
compatible with the chemical or physical adsorption of the organic thinfilm on its surface. 
For instance, if the patches comprise a coating between the substrate and a monolayer of
molecules of the formula I, then it is understood that the coating must be composed of a
material capable of binding the trifunctional crosslinking group of formula I. If no such
coating is present, then it is understood that the substrate must be composed of a material
which can covalently bind the trifunctional crosslinking group.

In a preferred embodiment of the invention, the regions of the substrate
surface, or coating surface, which separate the patches of polypeptides are free of organic
thinfilm. In an alternative embodiment, the organic thinfilm extends beyond the area of the
substrate surface, or coating surface if present, covered by the polypeptide patches. For
instance, optionally, the entire surface of the array can be covered by an organic thinfilm on
which the plurality of spatially distinct patches of polypeptides reside. An organic thinfilm
which covers the entire surface of the array can be homogenous or can optionally comprise
patches of differing exposed functionalities useful in the immobilization of patches of
different polypeptides. In still another alternative embodiment, the regions of the substrate
surface, or coating surface if a coating is present, between the patches of polypeptides are
covered by an organic thinfilm, but an organic thinfilm of a different type than that of the
patches of polypeptides. For instance, the surfaces between the patches of polypeptides can
be coated with an organic thinfilm characterized by low non-specific binding properties for
polypeptides and other analytes.

A variety of techniques can be used to generate patches of organic thinfilm on 
the surface of the substrate or on the surface of a coating on the substrate. These techniques 
are well known to those skilled in the art and will vary depending upon the nature of the 
organic thinfilm, the substrate, and the coating if present. The techniques will also vary 
depending on the structure of the underlying substrate and the pattern of any coating present 
on the substrate. For instance, patches of a coating which is highly reactive with an organic 
thinfilm can have already been produced on the substrate surface. Arrays of patches of 
organic thinfilm can optionally be created by microfluidics printing, microstamping (US 
Patent Nos. 5,512,131 and 5,731,152), or microcontact printing (μCP) (PCT Publication 
WO 96/29629). Subsequent immobilization of polypeptides to the reactive monolayer 
patches results in two-dimensional arrays of the agents. Inkjet printer heads provide another 
option for patterning monolayer molecules, or components thereof, or other organic thinfilm 
components to nanometer or micrometer scale sites on the surface of the substrate or coating 
(Lemmo et al., Anal. Chem., 1997, 69:543-551; US Patent Nos. 5,843,767 and 5,837,860). 
In some cases, commercially available arrayers based on capillary dispensing (for instance, 
OmniGrid™ from Genemachines, Inc., San Carlos, CA, and High-Throughput Microarrayer 
from Intelligent Bio-Instruments, Cambridge, MA) can also be of use in directing 
components of organic thinfilms to spatially distinct regions of the array. 

Diffusion boundaries between the patches of polypeptides immobilized on 
organic thinfilms such as self-assembled monolayers can be integrated as topographic 
patterns (physical barriers) or surface functionalities with orthogonal wetting behavior 
(chemical barriers). For instance, walls of substrate material or photoresist can be used to 
separate some of the patches from some of the others or all of the patches from each other. 
Alternatively, non-bioreactive organic thinfilms, such as monolayers, with different 
wettability can be used to separate patches from one another. 

In some embodiments, the polypeptide species are attached to a chip that has 
a non-sample surface and a plurality of sample portions that are elevated with respect to the 
non-sample surface. Suitable chips, which are described in co-pending US Patent 
Application 09/792335, filed February 23, 2001, generally include an array of reactive 
surfaces on the tops of pillars of well-defined dimensions. The tops of the pillars consist of, 
or are coated with, an interface layer capable of binding or adsorbing, or reacting with 
molecules contained in the material in channels that are present in a dispenser, as described 
therein. The pillar walls in the base between the pillars are designed either by structural 
topography, material choice, or surface coatings, in such a way that they minimize or prevent 
liquid cross-contamination between the individual pillars during the transfer or reaction step 
when the dispenser and chip are engaged. Using the same design techniques, these areas of 
the chip are also made resistant to the adsorption of the molecules or materials to be 
transferred or reacted. Together, these design features will prevent contamination between 
the top surfaces of the pillars. Thus, the biochip includes a topographical design wherein 
elevated surfaces or pillars are provided for isolating various materials and chemical 
reactions for observation and analysis. 

Microfluid dispensers for providing materials in fluid form to the pillars are 
also described in US Patent Application 09/792335, filed February 23, 2001. The dispensers 
can be used to create a final biochip with materials on the pillars for later analysis or 
chemical reactions, can be used to create the chemical reactions, and can further be used to 
observe and analyze the chemical reactions. By using the dispenser with a flow-cell adaptor 
that introduces analytes to the capture sites on top of the pillars, one can easily avoid non- 
specific binding of analytes on the sides of the pillars or the substrate between pillars. 
D. Screening Methods 

Arrays of surface-attached polypeptide species that are obtained using the 
methods of the invention are typically screened to identify those that have a desired activity 
(e.g., binding affinity to a target molecule of interest). Binding of a target molecule to the 
polypeptides of the arrays can be detected by a number of methods known to those of skill in 
the art. In one embodiment, fluorescent tags can be attached to known targets and binding 
can be measured by detecting fluorescence. Alternatively, ellipsometry (see, e.g., Elwing, H. 
Biomaterials 19(4-5): 397-406 (1998); Werner, C. et al. Int. J. Artif. Organs 22(3):160-176 
(1999); and Ostroff, R.M., et al. Clin. Chem. 45(9):1659-64 (1999)) or surface plasmon 
resonance spectroscopy (see, e.g., Mrksich, M., et al., Langmuir 1995, 4383; Mrksich, M., et 
al., J. Am. Chem. Soc. 1995, 117:12009; Sigal, G. B., et al., Anal. Chem. 1996, 68:490) can 
also be used to detect binding events (e.g., on surfaces). These assays are particularly useful 
in detecting target molecules in complex mixtures such as blood or other bodily fluids. 
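
As a rough illustration of fluorescence-based detection on an array, the following sketch flags patches whose signal rises above the background measured on polypeptide-free regions. The function name, the readings, and the cutoff of three standard deviations above mean background are assumptions made for illustration only; none of them is specified in this disclosure.

```python
from statistics import mean, stdev


def call_hits(patch_signal, background, n_sigma=3.0):
    """Return IDs of patches whose fluorescence exceeds the background cutoff.

    patch_signal: dict mapping patch ID -> fluorescence (arbitrary units)
    background:   readings taken from regions between patches
    n_sigma:      hypothetical cutoff multiplier; choose per assay
    """
    # Cutoff = mean background plus n_sigma sample standard deviations.
    cutoff = mean(background) + n_sigma * stdev(background)
    return sorted(pid for pid, value in patch_signal.items() if value > cutoff)


# Hypothetical readings, for illustration only.
background = [100.0, 104.0, 96.0, 102.0, 98.0]
signals = {"A1": 101.0, "A2": 350.0, "B1": 99.0, "B2": 180.0}
hits = call_hits(signals, background)
print(hits)
```

In practice the cutoff would be set from control patches appropriate to the assay rather than from a fixed multiplier.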

The present invention also provides for transferring the target molecule to a 
reaction chamber(s) that, in one embodiment, provides solutions or conditions (e.g., elevated 
temperature) that dissociate the target molecule from the affinity molecule. The target 
molecule can then be detected using, e.g., liquid chromatography mass spectrometry (see, 
e.g., Niessen, W.M. J. Chromatogr. A, 856(1-2):179-97 (1999) and Maurer, H.H. J. 
Chromatogr. B Biomed. Sci. Appl. 713(1):3-25 (1998)) or other methods known to those of 
skill in the art. 

Conventionally, new chemical entities with useful properties are generated by 
identifying a chemical compound (called a "lead compound") with some desirable property 
or activity, creating variants of the lead compound, and evaluating the property and activity 
of those variant compounds. However, the current trend is to shorten the time scale for all 
aspects of drug discovery. Because of the ability to test large numbers quickly and 
efficiently, high throughput screening (HTS) methods are replacing conventional lead 
compound identification methods. 

In one preferred embodiment, high throughput screening methods involve 
providing a library containing a large number of potential therapeutic compounds (candidate 
compounds). Such "combinatorial chemical libraries" are then screened in one or more 
assays to identify those library members (particular chemical species or subclasses) that 
display a desired characteristic activity. The compounds thus identified can serve as 
conventional "lead compounds" or can themselves be used as potential or actual 
therapeutics. 

1. Combinatorial chemical libraries 

Recently, attention has focused on the use of combinatorial chemical libraries 
to assist in the generation of new chemical compound leads. A combinatorial chemical 
library is a collection of diverse chemical compounds generated by either chemical synthesis 
or biological synthesis by combining a number of chemical "building blocks" such as 
reagents. For example, a linear combinatorial chemical library such as a polypeptide library 
is formed by combining a set of chemical building blocks called amino acids in every 
possible way for a given compound length (i.e., the number of amino acids in a polypeptide 
compound). Millions of chemical compounds can be synthesized through such 
combinatorial mixing of chemical building blocks. For example, one commentator has 
observed that the systematic, combinatorial mixing of 100 interchangeable chemical building 
blocks results in the theoretical synthesis of 100 million tetrameric compounds or 10 billion 
pentameric compounds (Gallop et al. (1994) J. Med. Chem. 37(9): 1233-1250). 
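
The figures quoted above follow from simple counting: with n interchangeable building blocks and a linear compound of length k, each position can be filled independently, giving n to the power k possible sequences. The short sketch below checks the arithmetic; the function name is hypothetical and chosen only for illustration.

```python
def library_size(building_blocks: int, length: int) -> int:
    """Number of distinct linear compounds of a given length.

    Each of the `length` positions can hold any of the `building_blocks`
    monomers independently, so the count is building_blocks ** length.
    """
    return building_blocks ** length


tetramers = library_size(100, 4)
pentamers = library_size(100, 5)
print(f"{tetramers:,}")  # 100,000,000 (100 million)
print(f"{pentamers:,}")  # 10,000,000,000 (10 billion)
```

The same count applied to the 20 common amino acids gives 20 ** k polypeptides of length k, which is why even short peptide libraries grow large quickly.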

Preparation and screening of combinatorial chemical libraries are well known 
to those of skill in the art. Such combinatorial chemical libraries include, but are not limited 
to, peptide libraries (see, e.g., U.S. Patent 5,010,175, Furka (1991) Int. J. Pept. Prot. Res., 
37: 487-493, Houghton et al. (1991) Nature, 354: 84-88). Peptide synthesis is by no means 
the only approach envisioned and intended for use with the present invention. Other 
chemistries for generating chemical diversity libraries can also be used. Such chemistries 
include, but are not limited to: peptoids (PCT Publication No WO 91/19735, 26 Dec. 1991), 
encoded peptides (PCT Publication WO 93/20242, 14 Oct. 1993), random biooligomers 
(PCT Publication WO 92/00091, 9 Jan. 1992), benzodiazepines (U.S. Pat. No. 5,288,514), 
diversomers such as hydantoins, benzodiazepines and dipeptides (Hobbs et al., (1993) Proc. 
Nat. Acad. Sci. USA 90: 6909-6913), vinylogous polypeptides (Hagihara et al. (1992) J. 
Amer. Chem. Soc. 114: 6568), nonpeptidal peptidomimetics with a Beta D Glucose 
scaffolding (Hirschmann et al., (1992) J. Amer. Chem. Soc. 114: 9217-9218), analogous 
organic syntheses of small compound libraries (Chen et al. (1994) J. Amer. Chem. Soc. 116: 
2661), oligocarbamates (Cho, et al., (1993) Science 261:1303), and/or peptidyl phosphonates 
(Campbell et al. (1994) J. Org. Chem. 59: 658). See, generally, Gordon et al., (1994) J. 
Med. Chem. 37:1385, nucleic acid libraries, peptide nucleic acid libraries (see, e.g., U.S. 
Patent 5,539,083), antibody libraries (see, e.g., Vaughn et al. (1996) Nature Biotechnology, 
14(3): 309-314, and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang et al. (1996) 
Science, 274: 1520-1522, and U.S. Patent 5,593,853), and small organic molecule libraries 
(see, e.g., benzodiazepines, Baum (1993) C&EN, Jan 18, page 33, isoprenoids U.S. Patent 
5,569,588, thiazolidinones and metathiazanones U.S. Patent 5,549,974, pyrrolidines U.S. 
Patents 5,525,735 and 5,519,134, morpholino compounds U.S. Patent 5,506,337, 
benzodiazepines 5,288,514, and the like). 

Devices for the preparation of combinatorial libraries are commercially 
available (see, e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY, Symphony, 
Rainin, Woburn, MA, 433 A Applied Biosystems, Foster City, CA, 9050 Plus, Millipore, 
Bedford, MA). 

A number of well known robotic systems have also been developed for 
solution phase chemistries. These systems include automated workstations like the 
automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka, 
Japan) and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation, 
Hopkinton, Mass.; Orca, Hewlett Packard, Palo Alto, Calif.) which mimic the manual 
synthetic operations performed by a chemist. Any of the above devices are suitable for use 
with the present invention. The nature and implementation of modifications to these devices 
(if any) so that they can operate as discussed herein will be apparent to persons skilled in the 
relevant art. In addition, numerous combinatorial libraries are themselves commercially 
available (see, e.g., ComGenex, Princeton, NJ, Asinex, Moscow, RU, Tripos, Inc., St. 
Louis, MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek 
Biosciences, Columbia, MD, etc.). 

2. High throughput assays of chemical libraries 

A variety of assays can be used to measure the interaction of different 
molecular components, e.g., to identify compounds that bind or inhibit gene products or that 
interact with a specific molecule. High throughput assays for the presence, absence, or 
quantification of particular nucleic acids or polypeptide products are well known to those of 
skill in the art. Binding assays are similarly well known. Thus, for example, U.S. 
Patent 5,559,410 discloses high throughput screening methods for polypeptides, U.S. Patent 
5,585,639 discloses high throughput screening methods for nucleic acid binding (i.e., in 
arrays), while U.S. Patents 5,576,220 and 5,541,061 disclose high throughput methods of 
screening for ligand/antibody binding. 

In addition, high throughput screening systems are commercially available 
(see, e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman 
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems 
typically automate entire procedures including all sample and reagent pipetting, liquid 
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate 
for the assay. These configurable systems provide high throughput and rapid start up as well 
as a high degree of flexibility and customization. The manufacturers of such systems 
provide detailed protocols for the various high throughput systems. Thus, for example, Zymark Corp. 
provides technical bulletins describing screening systems for detecting the modulation of 
gene transcription, ligand binding, and the like. 

A discussion of the above technology and other relevant aspects of 
technology related to the present invention can be found in PCT Publication No. WO 
200004382, entitled Arrays Of Proteins And Methods Of Use Thereof, Wagner, P. et al.; 
PCT Publication No. WO 200004389, entitled Arrays Of Protein-Capture Agents And 
Methods Of Use Thereof, Wagner, P. et al.; and PCT Publication No. WO 200004390, 
entitled Micro Devices For Screening Biomolecules, Wagner, P. et al. 

E. Kits 

The present invention further provides for kits to be supplied to end users for 
attaching polypeptides described herein to surfaces of substrates in a manner as provided by 
the methods herein disclosed. Kits may supply reagents including, for example, anchor 
molecule reagents, activating compounds and agents for activating polypeptide esters or 
thioesters, or for activating components, including surface attachment functional groups 
orthogonally from anchor molecule/polypeptide ligation groups, substrates, including 
substrates pre-derivatized with anchor molecules and/or substrates ready to receive anchor 
molecules, and instructions. 

Other embodiments of kits include providing polypeptides containing an ester 
or thioester along with components including instructions, anchor molecules, substrates, anchor 
molecule derivatized substrates, or where the polypeptide has been modified with the anchor 
molecule. 

It is understood that the examples and embodiments described herein are for 
illustrative purposes only and that various modifications or changes in light thereof will be 
suggested to persons skilled in the art and are to be included within the spirit and purview of 
this application and scope of the appended claims. All publications, patents, and patent 
applications cited herein are hereby incorporated by reference for all purposes. 

WHAT IS CLAIMED IS: 

1. A method for immobilizing a polypeptide to a surface, wherein the method comprises: 
    contacting a polypeptide which comprises an ester or thioester, with an anchor 
    molecule comprising a first nucleophilic group at a 2 or 3 position relative to a 
    second nucleophilic group, 
        wherein the ester or thioester undergoes a trans-esterification reaction with the 
        first nucleophilic group, thus forming an intermediate compound in which the 
        polypeptide is attached to the anchor molecule through the first nucleophilic 
        group; and 
    attaching the anchor molecule to a surface. 

2. The method of claim 1, wherein the intermediate compound undergoes an 
intramolecular rearrangement in which the second nucleophilic group on the anchor 
molecule displaces the first nucleophilic group, thus forming a more stable bond between 
the anchor molecule and the polypeptide. 

3. The method of claim 1, wherein the polypeptide comprises a thioester. 

4. The method of claim 1, wherein the anchor molecule comprises a 2-aminonucleophile 
or a 3-aminonucleophile. 

5. The method of claim 4, wherein the 2-aminonucleophile is a 2-aminothiol. 

6. The method of claim 5, wherein the anchor molecule comprises a structure selected 
from the group consisting of: 

[chemical structures not reproduced in this text] 


7. The method of claim 1, wherein the anchor molecule is attached to the surface prior to 
contacting the anchor molecule with the polypeptide. 

8. The method of claim 1, wherein the anchor molecule is attached to the surface after 
contacting the anchor molecule with the polypeptide. 

9. The method of claim 1, wherein the anchor molecule comprises a functional group that 
can be covalently linked to a molecule that is attached to the surface. 

10. The method of claim 9, wherein the functional group is selected from the group 
consisting of ketones, diketones, olefins, epoxides, aldehydes, reactive esters, 
isocyanates, thioisocyanates, carboxylic acid chlorides, disulfides, sulfonate esters, 
maleimide, isomaleimide, N-hydroxysuccinimide, nitrilotriacetic acid, activated hydroxyl, 
haloacetyl, activated carboxyl, hydrazide, epoxy, aziridine, sulfonylchloride, acyl 
hydrazines, trifluoromethyldiaziridine, pyridyldisulfide, N-acyl-imidazole, 
imidazolecarbamate, vinylsulfone, succinimidylcarbonate, arylazide, anhydride, 
diazoacetate, benzophenone, isothiocyanate, isocyanate, imidoester, aminooxy and 
fluorobenzene. 



11. The method of claim 1, wherein the anchor molecule comprises a tag moiety that can 
be noncovalently bound to a molecule that is attached to the surface. 

12. The method of claim 11, wherein the tag comprises a binding domain which is derived 
from a polypeptide selected from the group consisting of glutathione-S-transferase (GST), 
maltose-binding protein, chitin, cellulase, thioredoxin, avidin, streptavidin, and 
green-fluorescent protein (GFP). 

13. The method of claim 11, wherein the tag comprises a chitin binding domain or a 
cellulose binding domain. 

14. The method of claim 11, wherein the tag comprises a peptide that comprises an 
amino-terminal Cys, Thr, or Ser. 

15. The method of claim 1, wherein the polypeptide comprises a non-natural amino acid. 

16. The method of claim 1, wherein the ester or thioester is chemically introduced onto 
the polypeptide. 

17. The method of claim 1, wherein the ester or thioester is introduced onto the 
polypeptide by chemical synthesis of the polypeptide. 

18. The method of claim 1, wherein the polypeptide that comprises an ester or thioester is 
obtained by: 
    expressing a chimeric gene that encodes a fusion protein which comprises: 
        the polypeptide and an intein, or a functional portion thereof, which is joined to 
        the polypeptide at a splice junction at the amino terminus of the intein, wherein 
        the carboxyl terminus of the intein lacks a functional splice junction; and 
    contacting the fusion protein with a nucleophilic compound which releases the 
    polypeptide from the intein at the splice junction and forms the polypeptide that 
    comprises a terminal ester or thioester. 

19. The method of claim 18, wherein the nucleophilic compound is the anchor molecule. 

20. The method of claim 18, wherein the nucleophilic compound comprises a peptide. 

21. The method of claim 20, wherein the peptide comprises a serine, threonine or cysteine 
at its amino terminus, the oxygen and sulfur of which are the nucleophilic groups that 
undergo the transesterification reaction. 

22. The method of claim 18, wherein the nucleophilic compound comprises a thiol as the 
nucleophile. 

23. The method of claim 18, wherein the intein is an Int-n of a split intein and the anchor 
molecule comprises an amino acid sequence that comprises an Int-c of a split intein, 
wherein the Int-n and the Int-c undergo an intein splicing reaction, thus attaching the 
anchor molecule to the polypeptide. 

24. The method of claim 23, wherein the Int-n is derived from a dnaE-n gene and the 
Int-c is derived from a dnaE-c gene. 

25. The method of claim 24, wherein the dnaE-n gene and the dnaE-c gene are from a 
cyanobacterium species. 

26. The method of claim 25, wherein the cyanobacterium species is a Synechocystis 
species. 

27. The method of claim 18, wherein the fusion protein is expressed in vitro. 

28. The method of claim 18, wherein the fusion protein is expressed in vivo by 
introducing the chimeric gene into a host cell and incubating the host cell under 
conditions conducive to expression of the fusion protein. 

29. The method of claim 1, wherein the surface comprises a biochip. 

30. The method of claim 29, wherein the biochip comprises a non-sample surface and a 
plurality of sample portions that are elevated with respect to the non-sample surface and 
each sample portion has attached thereto a single polypeptide species. 

31. The method of claim 29, wherein the biochip comprises one or more materials 
selected from the group consisting of silicon, plastic, gold, and glass. 

32. The method of claim 1, wherein the surface comprises a microparticle. 

33. The method of claim 1, wherein the polypeptide is placed in contact with the surface 
using a microvolume dispenser that comprises: 
    a body; and 
    at least one vertical channel defined within the body, the channel being defined by at 
    least one passive valve; 
        wherein an interior surface defining at least one vertical channel is hydrophobic. 

34. The method of claim 33, wherein the dispenser comprises a plurality of vertical 
channels defined within the body. 

35. The method of claim 34, wherein the vertical channels are arranged as an array. 

36. An array of immobilized polypeptides attached to a surface, wherein the array 
comprises at least a first polypeptide species and a second polypeptide species and each of 
which polypeptide species are: 
    attached to a separate region of the surface; 
    attached to the surface in the same orientation; and 
    are folded in a secondary structure as required for a biological activity. 

37. The array of claim 36, wherein each of the peptide species are covalently attached to a 
surface-bound linker by a 2-aminonucleophile ester bond. 

38. The array of claim 37, wherein the 2-aminonucleophile ester bond is a 
2-aminothioester bond. 

39. The array of claim 37, wherein the 2-aminonucleophile ester bond undergoes an 
intramolecular rearrangement to form an amide bond. 

40. The array of claim 37, wherein the linker is a non-peptide linker. 

41. The array of claim 36, wherein the C-terminus of each of the polypeptides is attached 
to the surface. 

42. The array of claim 37, wherein the linker comprises a structure selected from the 
group consisting of: 

[chemical structures not reproduced in this text] 

43. The array of claim 36, wherein the surface comprises a biochip. 

44. The array of claim 43, wherein the biochip comprises a non-sample surface and a 
plurality of sample portions that are elevated with respect to the non-sample surface and 
each sample portion has attached thereto a single polypeptide species. 

45. The array of claim 43, wherein the biochip comprises one or more materials selected 
from the group consisting of silicon, plastic, gold, and glass. 

46. An array of immobilized polypeptides attached to a surface which comprises a 
plurality of surface regions, wherein each surface region has attached thereto a 
polypeptide species and a polynucleotide that encodes the polypeptide species. 

47. The array of claim 46, wherein the surface comprises a biochip. 

48. The array of claim 47, wherein the biochip comprises a non-sample surface and a 
plurality of sample portions that are elevated with respect to the non-sample surface and 
each sample portion has attached thereto a single polypeptide species and a 
polynucleotide that encodes the polypeptide species. 

49. The array of claim 47, wherein the biochip comprises one or more materials selected 
from the group consisting of silicon, silicon oxide, plastic and glass. 

50. A method for screening a library of nucleic acids to identify a nucleic acid that 
encodes a polypeptide having a desired activity, the method comprising: 
    expressing a plurality of fusion proteins, each of which is encoded by an expression 
    cassette that comprises: 
        a) a member of the library of nucleic acids; 
        b) an intein coding region; and 
        c) an open reading frame that encodes a polypeptide that is displayed on a surface 
        of a replicable genetic package; 
    wherein the fusion proteins are displayed on the surface of a replicable genetic 
    package; and 
    screening the replicable genetic packages to identify those that display a polypeptide 
    having the desired activity. 

51. The method of claim 50, wherein the polypeptide encoded by the library member is 
released from the fusion protein by contacting the phage with a nucleophilic compound, 
which nucleophilic compound becomes attached to the polypeptide. 

52. The method of claim 51, wherein the nucleophilic compound comprises a compound 
that has a first nucleophilic group and a second nucleophilic group at a 2 or 3 position 
relative to the first nucleophilic group. 

53. The method of claim 52, wherein the nucleophilic compound is a 2-aminonucleophile 
or a 3-aminonucleophile. 

54. The method of claim 53, wherein the nucleophilic compound is a 2-aminothiol or a 
3-aminothiol. 

55. The method of claim 51, wherein the nucleophilic compound comprises a thiol or a 
hydroxyl. 

56. A nucleic acid that comprises an expression cassette, wherein the expression cassette 
comprises: 
    an insertion site at which a polynucleotide can be introduced into the expression 
    cassette; 
    an intein coding region, wherein the carboxyl terminus of the intein coding region is 
    mutated so that it does not function as a splice junction for intein-mediated cleavage; 
    and 
    an open reading frame that encodes a polypeptide that is displayed on a surface of a 
    replicable genetic package; 
    wherein the introduction of a polynucleotide at the insertion site results in an open 
    reading frame that encodes a fusion protein which comprises a polypeptide encoded by 
    the polynucleotide, which polypeptide is attached at its carboxyl terminus to an amino 
    terminus of the intein, and the surface-displayed polypeptide is attached to a carboxyl 
    terminus of the intein. 

57. The nucleic acid of claim 56, wherein the expression cassette further comprises a 
promoter. 

58. The nucleic acid of claim 56, wherein the polynucleotide is a member of a library of 
polynucleotides. 

59. The nucleic acid of claim 58, wherein the library of polynucleotides is a library of 
cDNA molecules, genomic DNA fragments, or recombination products. 

60. A method for immobilizing a polypeptide to a surface, wherein the method comprises: 
    contacting a polypeptide which comprises an ester or thioester, with an anchor 
    molecule comprising a first nucleophilic group at a 2 or 3 position relative to a 
    second nucleophilic group, 
        wherein the ester or thioester undergoes a trans-esterification reaction with the 
        first nucleophilic group, thus forming an intermediate compound in which the 
        polypeptide is attached to the anchor molecule through the first nucleophilic 
        group; 
        wherein said intermediate compound undergoes an intramolecular rearrangement 
        in which the second nucleophilic group on the anchor molecule displaces the 
        first nucleophilic group, thus forming a bond between the anchor molecule and 
        the polypeptide; and 
    attaching the anchor molecule to a surface. 

61. A method for immobilizing a polypeptide to a surface, wherein the method comprises: 
    contacting a polypeptide which comprises an ester or thioester, with an anchor 
    molecule comprising a reactive group selected from the group consisting of a 
    NH2-NH-R group and an aminooxy group, 
        wherein R represents an anchor molecule, 
        wherein the ester or thioester reacts with the reactive group, thus forming a 
        compound comprising a polypeptide attached to the anchor molecule through the 
        reactive group. 

62. The method of claim 61, wherein the polypeptide that comprises an ester or a thioester 
is obtained by: 
    expressing a chimeric gene that encodes a fusion protein which comprises: 
        the polypeptide; and 
        an intein, or a functional portion thereof, which is joined to the polypeptide at a 
        splice junction at the amino terminus of the intein, wherein the carboxyl terminus 
        of the intein lacks a functional splice junction; and 
    contacting the fusion protein with a nucleophilic compound which releases the 
    polypeptide from the intein at the splice junction and forms the polypeptide that 
    comprises a terminal ester or thioester. 

63. The method of claim 62, wherein the nucleophilic compound is the anchor molecule. 

64. The method of claim 62, wherein the nucleophilic compound comprises a peptide. 

65. The method of claim 64, wherein the peptide comprises a serine, threonine or cysteine 
at its amino terminus. 



WO 01/98458    PCT/US01/19531 



66. The method of claim 62, wherein the nucleophilic compound comprises a thiol as the 
nucleophile. 

67. The method of claim 61, wherein the anchor molecule is attached to the surface after 
contacting the anchor molecule with the polypeptide. 

68. The method of claim 61, wherein the anchor molecule comprises a functional group 
that can be covalently linked to a molecule that is attached to the surface. 

69. The method of claim 68, wherein the functional group is selected from the group 
consisting of ketones, diketones, olefins, epoxides, aldehydes, reactive esters, 
isocyanates, thioisocyanates, carboxylic acid chlorides, disulfides, sulfonate esters, 
maleimide, isomaleimide, N-hydroxysuccinimide, nitrilotriacetic acid, activated hydroxyl, 
haloacetyl, activated carboxyl, hydrazide, epoxy, aziridine, sulfonylchloride, acyl 
hydrazines, trifluoromethyldiaziridine, pyridyldisulfide, N-acyl-imidazole, 
imidazolecarbamate, vinylsulfone, succinimidylcarbonate, arylazide, anhydride, 
diazoacetate, benzophenone, isothiocyanate, isocyanate, imidoester, aminooxy and 
fluorobenzene. 

1 . 70. The method of claim 6 1 , wherein the anchor molecule comprises a tag 

2 moiety that can be noncovaiently bound to a molecule that is attached to the surface. 

1 71. The method of claim 70, wherein the tag comprises a binding domain 

2 which is derived firom a polypeptide selected firom the group consisting of glutathione-S- 

3 transferase (GST), maltose-binding protein, chitin, cellulase, thioredoxin, avidin, 

4 strq)tavidin, and green-fluorescent protein (GFP). 

1 72. The method of claim 70, wherein the tag comprises a chitin binding 

2 domain or a cellulose binding domain. 

1 73. The method of claim 70, wherein the tag comprises a peptide that 

2 comprises an amino-terminal Cys, Thr, or Ser. 



1 74. The method of claim 61, wherein the polypeptide comprises a 

2 non-natural amino acid. 

1 75. The method of claim 61, wherein the ester or thioester is chemically 

2 introduced onto the polypeptide. 

1 76. The method of claim 61, wherein the ester or thioester is introduced 

2 onto the polypeptide by chemical synthesis of the polypeptide. 

1 77. A kit for use in immobilizing one or more polypeptides containing an 

2 ester or thioester to a surface of a substrate comprising: 

3 an anchor molecule reagent for adapting said ester or thioester containing 

4 polypeptide to said surface, 

5 wherein said anchor molecule comprises a first nucleophilic group at a 

6 2 or 3 position relative to a second nucleophilic group, 

7 wherein the ester or thioester of said one or more polypeptides 

8 undergoes a trans-esterification reaction with the first nucleophilic group, thus forming an 

9 intermediate compound in which the polypeptides are attached to the anchor molecules 

10 through the first nucleophilic group, 

11 wherein said anchor molecule is adapted for attachment to said 

12 surface of said substrate. 

1 78. The kit of claim 77 further comprising: 

2 a DNA vector for introducing said ester or thioester into said polypeptide, 

3 said vector being adapted to receive a nucleic acid sequence encoding said polypeptide to 

4 form an ester or thioester polypeptide expression vector for expressing said polypeptide as an 

5 ester or thioester polypeptide having said ester or said thioester incorporated therein. 

1 79. The kit of claim 77 further comprising: 

2 a chemical agent for introducing into said polypeptide an ester or thioester. 

1 80. The kit of claim 77 further comprising: 

2 instructions for instructing a user to carry out the method of claim 1 using 

3 said kit. 

1 81. The kit of claim 77 further comprising: 

2 a substrate for attaching said anchor molecules thereto for immobilizing said 

3 polypeptides thereon. 

1 82. The kit of claim 81, wherein said anchor molecule is supplied attached 

2 to said surface of said substrate for later attaching said polypeptide thereto by a user. 

1 83. The kit of claim 77, wherein said polypeptides are supplied with said 

2 kit. 

1 84. The kit of claim 83, wherein said polypeptides are supplied with said 

2 kit pre-coupled with said anchor molecule(s). 

1 85. The method of claim 77, wherein said substrate comprises a 

2 microparticle. 